Backing Up SMS Messages on an Android 7.0 Smartphone

My wife recently purchased a Samsung Galaxy Note 5 on the Internet; she didn't like it, so she decided to return it. She had still sent and received a lot of text messages during the couple of weeks she had the phone, so she wanted me to back up the SMS messages, as well as everything else on the phone. I had Kies 3 on my computer (I use it to back up my Galaxy Note 3), so I tried backing up my wife's phone with it. I received an error message telling me that I needed to use SmartSwitch. I didn't try downloading SmartSwitch; I just copied the folders on the Note 5 to my PC.

I was able to back up the SMS messages to an XML file, using the options available on Android 7.0. I copied the SMS backup to my PC, but it didn't have an accompanying XSL or CSS file. The XML file showed up just fine in a browser, but the content had no formatting. The body of each SMS message was also full of what looked like HTML entities, the date was some funny-looking number, and so on. My wife was going to get another Note 4, so I probably couldn't just put the XML file on it and expect it to display with any degree of friendliness. I had to find the appropriate XSL file, write one, or write a program that would transform the XML file into a user-friendly list of SMS messages.

I downloaded an app called SMS Backup & Restore. I got an error message when I tried loading my SMS backup into it; apparently the app doesn't use the same file format as the Android 7.0 SMS backup function. The app did come with its own XSL file, so I linked that to my backup, but the output was horrible looking. So I wrote a simple CSS file and linked that to the XML file. Now the display was nice looking, but the CSS didn't perform any of the necessary data conversions. (BTW, if you're wondering why I didn't just do another backup using the app, my wife had already returned her Note 5 by the time I got around to prettying up the backup file.) Here's the CSS file:

message {display: table; border: solid 1px; margin: 5px 5px 5px 5px;}
address {display: block;}
body {display: block;}
date {display: block;}
read {display: block;}
type {display: block;}
locked {display: block;}

The backup file is a typical-looking XML file:

<?xml version='1.0' standalone='yes' ?>
<?xml-stylesheet type="text/css" href="sms.css"?>
<file ver="2">
  <thread n="42">
    <message type="SMS">
      <address>Phone Number</address>
      <body>SMS Message</body>
      <date>Epoch Time</date>
      <read>1</read>
      <type>5</type>
      <locked>0</locked>
    </message>
  </thread>
</file>

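Since every field in this layout sits on its own line, a line-oriented script can pull out the element name and its text with one regular expression, without depending on how far each line is indented. Here's a minimal sketch of that idea; parse_field is a hypothetical helper name, and it assumes a field never spans more than one line:

```perl
use strict;
use warnings;

# Hypothetical helper: split a line such as "      <read>1</read>" into
# its element name and its text. Returns an empty list for any other line.
sub parse_field {
    my ($line) = @_;
    # \1 requires the closing tag to have the same name as the opening tag.
    return $line =~ m{^\s*<(\w+)>(.*)</\1>\s*$} ? ($1, $2) : ();
}

my ($name, $value) = parse_field("      <type>5</type>\n");
print "$name = $value\n";
```

Lines like <message type="SMS"> or </message> don't match the pattern, so the helper simply returns nothing for them.
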
I still needed to transform the content of the XML file. All of the words between the <body> and </body> tags were separated by '+' signs; I got rid of those with my word processor. The rest of the body content, and everything else, I transformed using a Perl program:


use strict;
use warnings;

# The input to this program is the output of the SMS backup on Android 7. The output of this program
# is a file that has the same content as the input, but all the "HTML entities" have been converted to
# readable characters.

# The Android backup of SMS messages creates an XML file and saves escaped characters as HTML entities;
# not quite - HTML entities are of the form '&#decimalNumber;', but these "entities" are of the
# form '%hexNumber', which is the percent-encoding used in URLs.
# The following is a hash table of these escapes, to be used as a lookup table.

my %entity = (  
              '%20' => ' ',   
              '%21' => '!',  
              '%22' => '"',  
              '%23' => '#',      
              '%24' => '$',     
              '%25' => '%',  
              '%26' => 'and',    # that should be an '&', but an 'and' is less work to display
              '%27' => "'", 
              '%28' => '(',  
              '%29' => ')',
              '%2A' => '*',
              '%2B' => '+',
              '%2C' => ',',
              '%2D' => '-',
              '%2E' => '.',
              '%2F' => '/',  
              '%3A' => ':',
              '%3B' => ';',
              '%3C' => '&lt;',   # a bare '<' would break the XML output, so it stays escaped
              '%3D' => '=',
              '%3E' => '>',
              '%3F' => '?',             
              '%40' => '@',
              '%5B' => '[',
              '%5C' => "\\",   
              '%5D' => ']',
              '%5E' => '^',
              '%5F' => '_',
              '%60' => '`',
              '%7B' => '{',
              '%7C' => '|',
              '%7D' => '}',
              '%7E' => '~'
             );

             my $line_of_text = "";
             my $outputFile = "xmlOutput.xml";
             my $escaped = '%[0-9a-fA-F]{2}';    # regex for an escape of the form '%hexNumber'

             open(my $ih, '<', "sms-20180812200032.xml")
                or die "Failed to open up xml file: $!\n";
             open (my $oh, '>', $outputFile)
                or die "Unable to create output file: $!\n";
             while (<$ih>)
             {
                # read in xml file, line by line. Search for HTML entities and convert to type char
                my $temp = $_;
                my $datePos = index($temp, "<date>");
                print $datePos."\n";
                if ($datePos == 6)
                {
                   my $endPos = index($temp, "</date>");
                   my $epochTime = substr($temp, 12, $endPos - 12);   # only the digits between the tags
                   $epochTime /= 1000;                                # milliseconds to seconds
                   print "Epoch time is: ".$epochTime."\n";
                   my $localTime = scalar localtime($epochTime);
                   print "Local time is: ".$localTime."\n";
                   $temp = "<date>".$localTime."</date>"."\n";        # keep the line break
                }

                $datePos = index($temp, "<type>");    # I decided to convert all the fields in
                print $datePos."\n";                             # the xml file. It's considered bad
                if ($datePos == 6)                              # programming practice to re-use variables
                {                                                        # for uses other than the original way they
                   my $epochTime = substr($temp, 12,1);  # were planned; too bad. $datePos is used
                   if ($epochTime == 1)                      # for every element; e.g., <date> and <type>
                   {
                      $temp = "<type>type: Received</type>"."\n";
                   }
                   elsif ($epochTime == 2)
                   {
                      $temp = "<type>type: Sent</type>"."\n";
                   }
                   elsif ($epochTime == 3)
                   {
                      $temp = "<type>type: Draft</type>"."\n";
                   }
                   elsif ($epochTime == 4)
                   {
                      $temp = "<type>type: Outbox</type>"."\n";
                   }
                   elsif ($epochTime == 5)
                   {
                      $temp = "<type>type: Failed</type>"."\n";
                   }
                   elsif ($epochTime == 6)
                   {
                      $temp = "<type>type: Queued</type>"."\n";
                   }
                   else
                   {
                      $temp = "<type>type: Unknown</type>"."\n";
                   }
                }

                $datePos = index($temp, "<locked>");
                print $datePos."\n";
                if ($datePos == 6)
                {                  
                   my $epochTime = substr($temp, 14,1);   # "<locked>" is 8 characters long, so the value starts at 14
                   if ($epochTime == 1)
                   {
                      $temp = "<locked>locked: Yes</locked>"."\n";
                   }
                   else
                   {
                      $temp = "<locked>locked: No</locked>"."\n";
                   }
                }
               
                $datePos = index($temp, "<read>");
                print $datePos."\n";
                if ($datePos == 6)
                {                  
                   my $epochTime = substr($temp, 12,1);
                   if ($epochTime == 1)
                   {
                      $temp = "<read>read: Yes</read>"."\n";
                   }
                   else
                   {
                      $temp = "<read>read: No</read>"."\n";
                   }
                }
               
                # Decode every escape in a single pass, so that text produced by one
                # replacement (e.g. the '%' from '%25') is never decoded a second time.
                # Codes that are missing from the lookup table are left unchanged.
                $temp =~ s/($escaped)/exists $entity{uc $1} ? $entity{uc $1} : $1/ge;
                print $oh $temp;
             }
             close $ih;
             close $oh;
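
The '+' signs between words and the '%hexNumber' escapes look like two halves of the same form-style URL encoding, so both could probably be undone in code without a lookup table or a word processor. Here's a minimal sketch under that assumption; decode_body is a hypothetical helper name, not part of the program above:

```perl
use strict;
use warnings;

# Hypothetical helper: undo form-style URL encoding in a message body.
sub decode_body {
    my ($text) = @_;
    $text =~ tr/+/ /;                               # '+' stands for a space
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # '%27' becomes "'"
    # Re-escape the two characters that would break the XML output.
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    return $text;
}

print decode_body("We%27ll+pick+up+the+vacuum+%26+the+truck%2E"), "\n";
```

The '+' signs are translated first, so a literal plus sign stored as '%2B' survives. This only handles single-byte characters; if the bodies contain multi-byte UTF-8 sequences, the decoded bytes would still need to go through something like Encode::decode('UTF-8', ...).
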


Here's an actual message produced by the above program:

phone number
Poor David, he's still emptying the truck, I'm so lucky to have him. We'll pick up the vacuum when I see you.
Tue Jul 31 17:42:45 2018

read: Yes
type: Sent
locked: No
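
The date and type lines in that message come from two small conversions: dividing the millisecond timestamp by 1000 before handing it to localtime, and mapping the numeric type code to a label. Here's a compact sketch of both, using the same type codes as the program above; format_date and %type_name are hypothetical names, and the timestamp is just an arbitrary example value:

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Message type codes, as used by the program above.
my %type_name = (1 => 'Received', 2 => 'Sent',   3 => 'Draft',
                 4 => 'Outbox',   5 => 'Failed', 6 => 'Queued');

# Hypothetical helper: turn the backup's millisecond timestamp into a
# readable local date, in a layout similar to scalar localtime().
sub format_date {
    my ($millis) = @_;
    return strftime('%a %b %d %H:%M:%S %Y', localtime(int($millis / 1000)));
}

my $label = $type_name{2} // 'Unknown';     # codes not in the table fall back to 'Unknown'
print format_date(1533080565000), "\n";     # the result depends on the local time zone
print "type: $label\n";
```

A hash lookup keeps the code-to-label mapping in one place, which is easier to extend than a chain of elsif branches.
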

It's been over a year since I made an entry on my webpage. The last thing I was working on was my emotions detection program; I need to get back to that now. I've been busy with all sorts of things over the last year, so I really didn't have much time for my webpage. I just moved, so once I get settled down, I should be able to get back into the swing of my normal life.

