Bug with StringReader speed? Or am I missing something?

Discussion in 'Scripting' started by johot, Sep 27, 2011.

  1. johot (Joined: Apr 11, 2011, Posts: 200)

    So I have a big text file of words that I need to parse. I have them all in a TextAsset: about 500,000 lines, roughly 8 MB.

    Now I want to parse this file line by line.

    I've tried three different ways:

    1. String.Split on the text property of the TextAsset (so one big string). This is really slow and I can't use it.

    2. StreamReader over a MemoryStream built from the bytes property of the TextAsset (so a byte[]). This is pretty fast, about 10x as fast as method 1.

    3. StringReader. This never even finishes. Do we have a bug here?
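For reference, method 2 could look roughly like the sketch below. It is not the original poster's code: the class and method names are made up for illustration, and UTF-8 is an assumption about how the file was saved. In Unity the byte array would come from the TextAsset's bytes property.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper, not part of Unity or .NET.
public static class ByteLineReader
{
    // Reads all lines from raw bytes (e.g. TextAsset.bytes) through a StreamReader.
    // Assumption: the bytes are UTF-8 encoded text.
    public static List<string> ReadLines(byte[] bytes)
    {
        var lines = new List<string>();
        using (var stream = new MemoryStream(bytes))
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            string line;
            // ReadLine returns null once the end of the stream is reached.
            while ((line = reader.ReadLine()) != null)
            {
                lines.Add(line);
            }
        }
        return lines;
    }
}
```

Collecting into a list is only for demonstration; with 500,000 lines you would probably process each line inside the loop instead.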

    Does anyone have any experience with StringReader? Is there a bug in it or something? Because it's really slow! I would have expected it to be about as fast as method 2.

    Why do I think it's a bug? Because if I loop through the characters of the file (using the TextAsset's text property) and build each line with a StringBuilder, basically simulating what StringReader should do (or so I think, anyway), I get about the same performance as method 2.
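The character loop described above might be sketched like this. It is a guess at the approach, not the poster's actual code; the class and method names are hypothetical. It walks the string once and treats "\n", "\r\n" and a lone "\r" as line endings.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not part of Unity or .NET.
public static class ManualLineReader
{
    // Splits text into lines in a single pass over the characters.
    public static List<string> ReadLines(string text)
    {
        var lines = new List<string>();
        var current = new StringBuilder();

        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (c == '\r' || c == '\n')
            {
                // Treat "\r\n" as one line ending by skipping the '\n'.
                if (c == '\r' && i + 1 < text.Length && text[i + 1] == '\n')
                {
                    i++;
                }
                lines.Add(current.ToString());
                current.Length = 0; // reset the builder for the next line
            }
            else
            {
                current.Append(c);
            }
        }

        // Emit the last line if the text does not end with a line ending.
        if (current.Length > 0)
        {
            lines.Add(current.ToString());
        }
        return lines;
    }
}
```

Because every character is visited exactly once, the cost grows linearly with the size of the text.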

    You can test this easily by creating a big TextAsset file and then trying to read it line by line with a StringReader:


    Code (csharp):
    // Requires: using System.IO;
    StringReader reader = new StringReader(aVeryLongString);
    string line;

    while ((line = reader.ReadLine()) != null) {
        // Process each line here.
    }
     
  2. EddieCam (Joined: Oct 28, 2009, Posts: 25)

    Old thread, but for anyone else googling: there is a bug in Mono's current StringReader.ReadLine method, as detailed here.

    Basically, if the string uses Unix line endings ("\n"), every call to ReadLine also searches the entire string for Windows line endings ("\r\n"), so reading all the lines is O(n^2)!

    As johot says, you may want to write your own ReadLine-style method using a StringBuilder for any text file with more than 1,000 lines.
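One way to avoid the repeated search is to only ever look forward from the current position. The sketch below is a variant that uses String.IndexOf with a start index instead of a StringBuilder. The names are hypothetical, and it assumes lines end in "\n" or "\r\n" (a lone "\r" is not treated as a line break).

```csharp
using System.Collections.Generic;

// Hypothetical helper, not part of Unity or .NET.
public static class LineScanner
{
    // Yields lines one at a time, searching only from the current position onward.
    public static IEnumerable<string> EnumerateLines(string text)
    {
        int start = 0;
        while (start < text.Length)
        {
            int end = text.IndexOf('\n', start);
            if (end < 0)
            {
                end = text.Length; // last line has no terminator
            }

            int length = end - start;
            // Drop the '\r' left over from a Windows "\r\n" ending.
            if (length > 0 && text[end - 1] == '\r')
            {
                length--;
            }

            yield return text.Substring(start, length);
            start = end + 1;
        }
    }
}
```

Each search resumes where the previous line ended, so no part of the string is scanned twice and the total work stays linear in the length of the text.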
     
  3. made-on-jupiter (Joined: May 19, 2013, Posts: 25)

    Still seems to be an issue with Mono 4.0.1.

    If all line endings in the string are of the same type, this is a quick way to replace Unix line endings with Windows ones (and leave Windows ones alone):


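    // Requires: using System.Text.RegularExpressions;
    // The first line ending decides: a match of length 1 is a bare "\n" (Unix).
    if (Regex.Match(text, "\r?\n").Length == 1)
        text = text.Replace("\n", "\r\n");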