How to convert byte array to string and vice versa? – Dev

QUESTION:

I have to convert a byte array to string in Android, but my byte array contains negative values.

If I convert that string back to a byte array, the values I get differ from the original byte array values.

What can I do to get proper conversion? Code I am using to do the conversion is as follows:

// Code to convert byte arr to str:
byte[] by_original = {0,1,-2,3,-4,-5,6};
String str1 = new String(by_original);
System.out.println("str1 >> "+str1);

// Code to convert str to byte arr:
byte[] by_new = str1.getBytes();
for (int i = 0; i < by_new.length; i++)
    System.out.println("by_new[" + i + "] >> " + by_new[i]); // print each byte, not the whole string

I am stuck on this problem.

ANSWER:

The “proper conversion” between byte[] and String is to explicitly state the encoding you want to use. If you start with a byte[] and it does not in fact contain text data, there is no “proper conversion”. Strings are for text, byte[] is for binary data, and the only really sensible thing to do is to avoid converting between them unless you absolutely have to.

If you really must use a String to hold binary data then the safest way is to use Base64 encoding.
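As a rough illustration of the Base64 route, here is a minimal sketch using java.util.Base64 from the standard library. The class and method names (Base64RoundTrip, toText, toBytes) are made up for this example. On Android, java.util.Base64 needs API level 26 or later; older versions offer android.util.Base64 instead, which is not shown here.

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper class, named for this example only.
class Base64RoundTrip {
    // Encode arbitrary bytes as an ASCII-only Base64 string.
    static String toText(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Decode a Base64 string back into the original bytes.
    static byte[] toBytes(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] original = {0, 1, -2, 3, -4, -5, 6}; // the array from the question
        String text = toText(original);
        byte[] restored = toBytes(text);

        System.out.println(text);
        System.out.println(Arrays.equals(original, restored)); // true
    }
}
```

The string is about a third longer than the raw bytes, but every byte value, including the negative ones, survives the round trip.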

ANSWER:

Your byte array must have some encoding. The encoding cannot be ASCII if you’ve got negative values. Once you figure that out, you can convert a set of bytes to a String using:

byte[] bytes = {...};
String str = new String(bytes, StandardCharsets.UTF_8); // for UTF-8 encoding

There are many encodings you can use; see the supported encodings listed in the Oracle Javadocs.

ANSWER:

We just need to construct a new String with the array: http://www.mkyong.com/java/how-do-convert-byte-array-to-string-in-java/

String s = new String(bytes);

The bytes of the resulting string differ depending on which charset you use. new String(bytes), new String(bytes, Charset.forName("utf-8")), and new String(bytes, Charset.forName("utf-16")) can all yield different byte arrays when you call String#getBytes() (depending on the default charset).
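To make the charset dependence concrete, the sketch below decodes the same two bytes under two different charsets. The class name CharsetComparison and the helper decode are invented for this example, and the sample bytes were chosen only because they are short.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, named for this example only.
class CharsetComparison {
    // Decode the same bytes under a caller-chosen charset.
    static String decode(byte[] data, Charset charset) {
        return new String(data, charset);
    }

    public static void main(String[] args) {
        // 0xC3 0xA9 is the UTF-8 encoding of U+00E9 (e with acute accent).
        byte[] bytes = {(byte) 0xC3, (byte) 0xA9};

        String asUtf8 = decode(bytes, StandardCharsets.UTF_8);
        String asLatin1 = decode(bytes, StandardCharsets.ISO_8859_1);

        System.out.println(asUtf8.length());         // 1: one character, U+00E9
        System.out.println(asLatin1.length());       // 2: U+00C3 followed by U+00A9
        System.out.println(asUtf8.equals(asLatin1)); // false
    }
}
```

The input bytes are identical in both calls; only the charset argument changes what the String contains.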

ANSWER:

The root problem is (I think) that you are unwittingly using a character set for which:

 bytes != encode(decode(bytes))

in some cases. UTF-8 is an example of such a character set. Specifically, certain sequences of bytes are not valid encodings in UTF-8. If the UTF-8 decoder encounters one of these sequences, it is liable to discard the offending bytes or decode them as the Unicode codepoint for “no such character” (the replacement character, U+FFFD). Naturally, when you then try to encode the characters as bytes, the result will be different.
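The effect can be reproduced with the byte array from the question. This is only a sketch: the class name Utf8LossDemo and the helper roundTripUtf8 are made up here, and UTF-8 is named explicitly rather than relying on the platform default.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, named for this example only.
class Utf8LossDemo {
    // Decode bytes as UTF-8, then encode the resulting String back to UTF-8.
    static byte[] roundTripUtf8(byte[] data) {
        String decoded = new String(data, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // -2, -4 and -5 are 0xFE, 0xFC and 0xFB, none of which is valid UTF-8.
        byte[] original = {0, 1, -2, 3, -4, -5, 6};

        String decoded = new String(original, StandardCharsets.UTF_8);
        // The decoder substituted U+FFFD for the bytes it could not decode.
        System.out.println(decoded.indexOf('\uFFFD') >= 0); // true

        // Re-encoding yields the bytes of U+FFFD, not the original values.
        System.out.println(Arrays.equals(original, roundTripUtf8(original))); // false
    }
}
```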

The solution is:

  1. Be explicit about the character encoding you are using; i.e. use a String constructor and a String.getBytes method with an explicit charset.
  2. Use the right character set for your byte data … or alternatively one (such as “Latin-1”) where all byte sequences map to valid Unicode characters.
  3. If your bytes are (really) binary data and you want to be able to transmit / receive them over a “text based” channel, use something like Base64 encoding … which is designed for this purpose.
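Points 1 and 2 might look like the following sketch, which names ISO-8859-1 (Latin-1) explicitly in both directions. The class and method names (Latin1RoundTrip, toText, toBytes) are invented for illustration. The resulting String is not meaningful text and may contain control characters, so for a text-based channel the Base64 option in point 3 is usually the better fit.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class, named for this example only.
class Latin1RoundTrip {
    // Map each byte to the char with the same value (U+0000 to U+00FF).
    static String toText(byte[] data) {
        return new String(data, StandardCharsets.ISO_8859_1);
    }

    // Map each char back to a single byte, using the same charset.
    static byte[] toBytes(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] original = {0, 1, -2, 3, -4, -5, 6}; // the array from the question
        byte[] restored = toBytes(toText(original));

        for (int i = 0; i < restored.length; i++) {
            System.out.println("restored[" + i + "] >> " + restored[i]);
        }
        System.out.println(Arrays.equals(original, restored)); // true
    }
}
```

The round trip only holds if the same charset is used on both sides; mixing it with the no-argument new String(bytes) or getBytes() brings the default charset back into play.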