[diffoscope] 03/03: Read lines using an iterator instead of loading a full list in memory

Jérémy Bobbio lunar at moszumanska.debian.org
Tue Dec 22 18:22:38 CET 2015

This is an automated email from the git hooks/post-receive script.

lunar pushed a commit to branch master
in repository diffoscope.

commit f0666e4e743f2ef10c008cb4328a6c05a3bd8770
Author: Jérémy Bobbio <lunar at debian.org>
Date:   Tue Dec 22 13:59:28 2015 +0000

    Read lines using an iterator instead of loading a full list in memory

    StreamReader.readlines() creates a list instead of an iterator. This
    meant that we were previously loading the entire content fed to diff
    into memory instead of streaming it as it was produced, creating huge
    buffers for no reason!

    So let's replace all `for line in f.readlines():` with `for line in f:`,
    which will properly use an iterator for the same end result.

    Thanks Mike Hommey for the report and good test case.

    Closes: #808120
---
 diffoscope/comparators/deb.py | 2 +-
 diffoscope/difference.py      | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)
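
To illustrate the point made in the commit message, here is a minimal sketch showing that iterating over a file object gives the same result as iterating over `readlines()`, without first building a list of every line. The helper name `parse_md5sums` and the sample data are hypothetical; the loop body is modelled on the `Md5sumsFile` hunk in this commit, and `io.StringIO` stands in for a real file.

```python
import io
import re


def parse_md5sums(lines):
    """Build a {path: md5sum} dict from any iterable of text lines.

    Hypothetical helper for illustration; it accepts a list or a
    file-like object, since both can be iterated line by line.
    """
    md5sums = {}
    for line in lines:
        md5sum, path = re.split(r'\s+', line.strip(), maxsplit=1)
        md5sums['./%s' % path] = md5sum
    return md5sums


# Made-up sample content in the format of a Debian md5sums file.
SAMPLE = (
    "d41d8cd98f00b204e9800998ecf8427e  usr/bin/foo\n"
    "900150983cd24fb0d6963f7d28e17f72  usr/share/doc/foo/README\n"
)

# Eager: readlines() materialises every line in a list before the loop starts.
eager = parse_md5sums(io.StringIO(SAMPLE).readlines())

# Lazy: the file object is its own iterator and yields one line at a time.
lazy = parse_md5sums(io.StringIO(SAMPLE))
```

Both calls produce the same dictionary. The difference is only in peak memory use, which matters when the input is a large stream such as the output of a running `diff` process.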

diff --git a/diffoscope/comparators/deb.py b/diffoscope/comparators/deb.py
index a6af031..733b8b4 100644
--- a/diffoscope/comparators/deb.py
+++ b/diffoscope/comparators/deb.py
@@ -71,7 +71,7 @@ class Md5sumsFile(File):
             md5sums = {}
             with open(self.path, 'r', encoding='utf-8') as f:
-                for line in f.readlines():
+                for line in f:
                     md5sum, path = re.split(r'\s+', line.strip(), maxsplit=1)
                     md5sums['./%s' % path] = md5sum
             return md5sums
diff --git a/diffoscope/difference.py b/diffoscope/difference.py
index 4a915dc..d6d49e1 100644
--- a/diffoscope/difference.py
+++ b/diffoscope/difference.py
@@ -56,7 +56,7 @@ class DiffParser(object):
         return self._success
     def parse(self):
-        for line in self._output.readlines():
+        for line in self._output:
             self._action = self._action(line.decode('utf-8', errors='replace'))
         self._success = True
@@ -226,7 +226,7 @@ def make_feeder_from_raw_reader(in_file, filter=lambda buf: buf):
     def feeder(out_file):
         line_count = 0
         end_nl = False
-        for buf in in_file.readlines():
+        for buf in in_file:
             line_count += 1
             max_lines = Config.general.max_diff_input_lines

Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/reproducible/diffoscope.git
