
Update the encoding type of a file

A stress character, or diacritical mark, is a glyph (or symbol) added to a letter. Examples: café, cliché, ôperàtor, Chloë, Brontë, coöperate, naïve. If a file contains stress characters and we try to import or display its contents, the content might not be imported or displayed the way it is supposed to be. In such a case:

  1.  Open the file in Notepad and check its encoding type.
  2.  Convert the encoding to UTF-8 and save the file.
  3.  Reimport the file or display its contents again. All the data in the file should now be imported, and the stress characters displayed, without any issue.
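To see why the conversion matters, here is a minimal sketch. It assumes the original file was saved as ISO-8859-1, which is only one of several possible source encodings, and the variable names are made up for illustration.

```php
<?php
// "café" as ISO-8859-1 bytes: the é is the single byte 0xE9.
$latin1 = "caf\xE9";

// That byte sequence is not valid UTF-8, so a UTF-8 reader cannot decode it.
var_dump(mb_check_encoding($latin1, 'UTF-8')); // bool(false)

// After conversion, é becomes the two-byte sequence 0xC3 0xA9.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
var_dump(mb_check_encoding($utf8, 'UTF-8'));   // bool(true)
echo bin2hex($utf8), PHP_EOL;                  // 636166c3a9
```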

Updating the encoding type by hand every time a new file is uploaded can be a hassle. To make it easier, we will write a piece of code that reads the contents of the file, checks the encoding type and, if it is anything other than UTF-8, converts the contents to UTF-8.

Let’s begin…

  1. $content = file_get_contents('files/upload.txt');
    file_get_contents() reads the contents of a file into a string.
  2. $currentEncoding = mb_detect_encoding($content, "UTF-8, ISO-8859-1, ISO-8859-15", true);
    mb_detect_encoding() detects the character encoding of a string.
  3. if ($currentEncoding != 'UTF-8') $content = mb_convert_encoding($content, 'UTF-8', $currentEncoding);
    mb_convert_encoding() converts the character encoding. Here, if the detected encoding is anything other than UTF-8, we convert the string to UTF-8.
  4. $fileSource = 'files/upload.txt';
    file_put_contents($fileSource, $content);
    This replaces the existing source file. If you want to save the converted string to a new file instead, use a different target path:
    $targetSource = 'files/encodes-upload.txt';
    file_put_contents($targetSource, $content);
    file_put_contents() writes a string to a file. It returns the number of bytes written to the file on success, or FALSE on failure. If the target file doesn't exist, it is created; otherwise, the existing file is overwritten.

Now, let's gather all the code into a single function:

Function to replace the existing file with its UTF-8 encoded version.
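The steps above could be combined roughly as follows. This is a sketch rather than the original listing: the function name convertFileToUtf8 and the early returns on failure are assumptions added for illustration.

```php
<?php
/**
 * Re-encode a file as UTF-8 in place (hypothetical helper name).
 * Returns true on success, false if the file cannot be read,
 * its encoding cannot be detected, or it cannot be written.
 */
function convertFileToUtf8(string $fileSource): bool
{
    if (!is_readable($fileSource)) {
        return false;
    }

    $content = file_get_contents($fileSource);
    if ($content === false) {
        return false;
    }

    // Strict detection against the same candidate list used in the steps above.
    $currentEncoding = mb_detect_encoding($content, ['UTF-8', 'ISO-8859-1', 'ISO-8859-15'], true);
    if ($currentEncoding === false) {
        return false;
    }

    // Only convert when the file is not already UTF-8.
    if ($currentEncoding !== 'UTF-8') {
        $content = mb_convert_encoding($content, 'UTF-8', $currentEncoding);
    }

    // Overwrite the original file with the UTF-8 content.
    return file_put_contents($fileSource, $content) !== false;
}
```

It would be called as convertFileToUtf8('files/upload.txt'); the file is overwritten, so keep a backup if the original bytes matter.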

Function to create a new file with UTF-8 encoding type, leaving the original file as it is.
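A variant that writes to a separate target file might look like the sketch below. Again, the name saveUtf8Copy and the error handling are assumptions, not the original code.

```php
<?php
/**
 * Write a UTF-8 copy of $fileSource to $targetSource (hypothetical helper name).
 * The original file is left untouched. Returns true on success, false otherwise.
 */
function saveUtf8Copy(string $fileSource, string $targetSource): bool
{
    if (!is_readable($fileSource)) {
        return false;
    }

    $content = file_get_contents($fileSource);
    if ($content === false) {
        return false;
    }

    // Strict detection against the same candidate list used in the steps above.
    $currentEncoding = mb_detect_encoding($content, ['UTF-8', 'ISO-8859-1', 'ISO-8859-15'], true);
    if ($currentEncoding === false) {
        return false;
    }

    // Only convert when the content is not already UTF-8.
    if ($currentEncoding !== 'UTF-8') {
        $content = mb_convert_encoding($content, 'UTF-8', $currentEncoding);
    }

    // Creates the target file if needed, otherwise overwrites it.
    return file_put_contents($targetSource, $content) !== false;
}
```

With the paths used earlier, the call would be saveUtf8Copy('files/upload.txt', 'files/encodes-upload.txt');.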

AND, WE ARE DONE!!

P.S. Play with the code and adapt it to the client's requirements.

 
