Monthly Archives: February 2016

Parsing a CSV with JavaScript

I had a question from a student about parsing a CSV file with JavaScript – not jQuery, not anything else, just JavaScript.  Easy, right?  It should be if you’ve worked with files and JavaScript before.  I hadn’t at the time, so it was a bit of a challenge, in a good way.

One caveat on the code in this post:  It’s ugly.  I’m using an inline “onsubmit” event handler for the form, and I hate myself for doing so.  It’s also not optimized in any way; it’s more proof of concept than anything.  If you’re going to use this in a production environment, first fix that event handler, then clean the code up and add error checking and handling.  I also don’t know how well this would perform with a large CSV file.

Speaking of CSV, the code assumes a CSV file that contains no commas other than those separating the actual values.  Here’s the sample that I used:

City,Temperature,Condition 
Stevens Point,41,Sunny
Chicago,54,Sunny
Montreal,45,Cloudy 
Halifax,50,Rain

As a side note, I want to make it back to Halifax once when it’s not raining.

Build an HTML Page

Let’s build an HTML page to grab the file.  The HTML is simple, just a form with an input type of “file” and a submit button.  The HTML also features a <table> element so that I can dump the resulting contents of the CSV out to the screen.

<!doctype html>
<html>
<head><title>CSV</title></head>
<body>
<form onsubmit="return processFile();" action="#" name="myForm" id="aForm" method="POST">
<input type="file" id="myFile" name="myFile"><br>
<input type="submit" name="submitMe" value="Process File">
</form>
<section>
<table id="myTable"></table>
</section>
</body>
</html>
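The inline “onsubmit” handler mentioned in the caveat could be replaced by a listener attached from script.  What follows is only a sketch under a few assumptions: attachSubmitHandler is a name made up for illustration, the onsubmit attribute has been removed from the form, and the script runs after the form exists in the page.

```javascript
// Hypothetical helper (not part of the original post): attaches a submit
// listener to a form instead of relying on an inline onsubmit attribute.
function attachSubmitHandler(form, handler) {
  form.addEventListener("submit", function (event) {
    // Cancels the default submission, like returning false from the inline handler.
    event.preventDefault();
    handler();
  });
}

// In the page: wire the form (id "aForm") to processFile, if both exist.
if (typeof document !== "undefined" && typeof processFile === "function") {
  attachSubmitHandler(document.getElementById("aForm"), processFile);
}
```

With this in place, processFile no longer needs to return false, although leaving the return in does no harm.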

JavaScript CSV

Next up is the JavaScript.  The file input exposes the selected files as a list through its “files” property, and the first entry is the one we want.  So:

var theFile = document.getElementById("myFile").files[0];

Now “theFile” refers to the file that the user selected.  Next comes some minimal error checking to see whether theFile is actually something.  If it is, a couple of variables are initialized for later use:

var table = document.getElementById("myTable");
 var headerLine = "";

And then the key bit:  A FileReader() object is instantiated:

var myReader = new FileReader();

A function is attached to the onload event of the myReader FileReader.  This function is where the magic happens:

 myReader.onload = function(e) {
   var content = myReader.result;
   var lines = content.split("\r");
   for (var count = 0; count < lines.length; count++) {
     var row = document.createElement("tr");
     var rowContent = lines[count].split(",");
     for (var i = 0; i < rowContent.length; i++) {
       if (count == 0) {
         var cellElement = document.createElement("th");
       } else {
         var cellElement = document.createElement("td");
       }
       var cellContent = document.createTextNode(rowContent[i]);
       cellElement.appendChild(cellContent);
       row.appendChild(cellElement);
     }  //end rowContent for loop
     table.appendChild(row);  //use the table variable looked up earlier
   } //end main for loop
 };  //end onload function
 myReader.readAsText(theFile);
 }  //end if(theFile)
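Because the file is read asynchronously, the onload/readAsText pairing can also be wrapped in a Promise, which gives read errors somewhere to go.  This is a minimal sketch, not the code used in this post: readFileAsText is a made-up name, and the optional ReaderCtor parameter is an assumption added only so the function can be exercised outside a browser.

```javascript
// Hypothetical wrapper: resolves with the file's text, rejects on a read error.
// ReaderCtor is an illustrative parameter; in a browser it defaults to FileReader.
function readFileAsText(file, ReaderCtor) {
  var Reader = ReaderCtor || FileReader;
  return new Promise(function (resolve, reject) {
    var reader = new Reader();
    reader.onload = function () {
      resolve(reader.result);   // fires once the whole file has been read
    };
    reader.onerror = function () {
      reject(reader.error);     // fires if the read fails
    };
    reader.readAsText(file);    // starts the asynchronous read
  });
}
```

In the page it might be used as readFileAsText(theFile).then(function (content) { ... }), with the table-building code moved into the callback.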

Actually, the magic begins outside of the onload function with the line

myReader.readAsText(theFile);

Calling readAsText starts reading the file asynchronously; once the read completes, the FileReader fires its onload event and the function runs.  The first line within the onload function then gathers the contents of the file into a variable called ‘content’.  The content is then split along Return characters (\r), which assumes that each line of the file ends with a bare carriage return.  So now we have a variable that contains the CSV line-by-line:

   var content = myReader.result;
   var lines = content.split("\r");
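Splitting on "\r" alone only works cleanly for files whose lines end in a bare carriage return.  A file with "\r\n" endings leaves a stray "\n" at the start of every line after the first, and a file with "\n" endings comes back as one long line.  A small sketch that accepts all three conventions (splitLines is a made-up name, not something from the code in this post):

```javascript
// Hypothetical helper: splits text on \r\n, \n, or a bare \r.
function splitLines(content) {
  // \r\n is listed first so a Windows line ending is consumed as one separator.
  var lines = content.split(/\r\n|\n|\r/);
  // A file that ends with a newline yields an empty last entry; drop it.
  if (lines.length > 0 && lines[lines.length - 1] === "") {
    lines.pop();
  }
  return lines;
}
```

Dropping the empty last entry also avoids appending an empty row to the table when the file ends with a newline.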

Next, a for loop is entered.  This for loop creates a new table row (tr) for each line in the CSV:

     var row = document.createElement("tr");

The contents of the row are then split at commas:

 var rowContent = lines[count].split(",");
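A plain split(",") is what limits this code to files with no commas inside values.  If quoted fields are needed, something along these lines could stand in for the split.  It is a sketch only: splitCsvLine is a made-up name, it assumes the usual convention of double-quoted fields with "" as an escaped quote, and it does not handle quoted fields that span several lines.

```javascript
// Hypothetical helper: splits one CSV line into fields, honoring double quotes.
function splitCsvLine(line) {
  var fields = [];
  var current = "";
  var inQuotes = false;
  for (var i = 0; i < line.length; i++) {
    var ch = line.charAt(i);
    if (inQuotes) {
      if (ch === '"') {
        if (line.charAt(i + 1) === '"') {
          current += '"';   // "" inside a quoted field is a literal quote
          i++;
        } else {
          inQuotes = false; // closing quote
        }
      } else {
        current += ch;      // commas are ordinary characters inside quotes
      }
    } else if (ch === '"') {
      inQuotes = true;      // opening quote
    } else if (ch === ",") {
      fields.push(current); // unquoted comma ends the field
      current = "";
    } else {
      current += ch;
    }
  }
  fields.push(current);     // last field has no trailing comma
  return fields;
}
```

For a line such as "Portland, ME",48,Drizzle (made-up data, not part of the sample file), this keeps the city as a single field instead of splitting it in two.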

The contents of each row (in the rowContent variable) are then iterated over in the inner for loop.  If it’s the first line of the CSV, we assume it contains heading values and therefore make a “th” element.  Otherwise, simple “td” elements are created for each cell in the table:

         if (count == 0) {
           var cellElement = document.createElement("th");
         } else {
           var cellElement = document.createElement("td");
         }

Next, the code creates a text node for each bit of content, appends that text node to its cell, and appends the cell to the row.  Once the row is complete, the row is appended to the HTML table.

         var cellContent = document.createTextNode(rowContent[i]);
         cellElement.appendChild(cellContent);
         row.appendChild(cellElement);
       }  //end rowContent for loop
       table.appendChild(row);
     } //end main for loop

Finally, processFile returns false so that the form isn’t actually submitted.

Here’s the full code, with in-page JavaScript:

<!doctype html>
<html>
<head><title>CSV</title></head>
<body>
<script type="text/javascript">
function processFile() {
  var fileSize = 0;
  var theFile = document.getElementById("myFile").files[0];
  if (theFile) {
    var table = document.getElementById("myTable");
    var headerLine = "";
    var myReader = new FileReader();
    // Runs once the file has been read
    myReader.onload = function(e) {
      var content = myReader.result;
      var lines = content.split("\r");
      for (var count = 0; count < lines.length; count++) {
        var row = document.createElement("tr");
        var rowContent = lines[count].split(",");
        for (var i = 0; i < rowContent.length; i++) {
          // First line of the CSV is treated as the header row
          if (count == 0) {
            var cellElement = document.createElement("th");
          } else {
            var cellElement = document.createElement("td");
          }
          var cellContent = document.createTextNode(rowContent[i]);
          cellElement.appendChild(cellContent);
          row.appendChild(cellElement);
        }
        table.appendChild(row);
      }
    };
    // Start the asynchronous read
    myReader.readAsText(theFile);
  }
  // Keep the form from being submitted
  return false;
}
</script>
<form onsubmit="return processFile();" action="#" name="myForm" id="aForm" method="POST">
<input type="file" id="myFile" name="myFile"><br>
<input type="submit" name="submitMe" value="Process File">
</form>
<section>
<table id="myTable"></table>
</section>
</body>
</html>