[diffoscope] 03/03: Read lines using an iterator instead of loading a full list in memory
Jérémy Bobbio
lunar at moszumanska.debian.org
Tue Dec 22 18:22:38 CET 2015
This is an automated email from the git hooks/post-receive script.
lunar pushed a commit to branch master
in repository diffoscope.
commit f0666e4e743f2ef10c008cb4328a6c05a3bd8770
Author: Jérémy Bobbio <lunar at debian.org>
Date: Tue Dec 22 13:59:28 2015 +0000
Read lines using an iterator instead of loading a full list in memory
StreamReader.readlines() creates a list instead of an iterator. This
meant that we were loading the entire content fed to diff into memory
instead of streaming it as it was produced, creating huge buffers for
no reason!
So let's replace all `for line in f.readlines():` with `for line in f:`,
which properly uses an iterator for the same end result.
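To illustrate the difference, here is a minimal sketch that is not part of
the patch. The helper names (make_reader, head_eager, head_lazy) are
hypothetical, and an in-memory buffer stands in for the output of an
external command. Both functions return the same lines, but only the first
one has to build the full list beforehand:

```python
import codecs
import io
from itertools import islice


def make_reader(payload):
    # Hypothetical helper: wrap in-memory bytes in a codecs.StreamReader,
    # standing in for the output of an external command.
    return codecs.getreader('utf-8')(io.BytesIO(payload))


def head_eager(reader, count):
    # Old pattern: readlines() decodes the whole stream into a list
    # before the first line is looked at.
    all_lines = reader.readlines()
    return all_lines[:count]


def head_lazy(reader, count):
    # New pattern: the reader is its own iterator, so lines are pulled
    # one at a time and we can stop early.
    return list(islice(reader, count))


payload = ''.join('line {}\n'.format(i) for i in range(1000)).encode('utf-8')
eager_head = head_eager(make_reader(payload), 3)
lazy_head = head_lazy(make_reader(payload), 3)
print(eager_head == lazy_head)
```

With a pipe from a long-running process in place of the in-memory buffer,
the lazy variant can start handing lines to the consumer before the
producer has finished.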
Thanks Mike Hommey for the report and good test case.
Closes: #808120
---
diffoscope/comparators/deb.py | 2 +-
diffoscope/difference.py | 4 ++--
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/diffoscope/comparators/deb.py b/diffoscope/comparators/deb.py
index a6af031..733b8b4 100644
--- a/diffoscope/comparators/deb.py
+++ b/diffoscope/comparators/deb.py
@@ -71,7 +71,7 @@ class Md5sumsFile(File):
         try:
             md5sums = {}
             with open(self.path, 'r', encoding='utf-8') as f:
-                for line in f.readlines():
+                for line in f:
                     md5sum, path = re.split(r'\s+', line.strip(), maxsplit=1)
                     md5sums['./%s' % path] = md5sum
             return md5sums
diff --git a/diffoscope/difference.py b/diffoscope/difference.py
index 4a915dc..d6d49e1 100644
--- a/diffoscope/difference.py
+++ b/diffoscope/difference.py
@@ -56,7 +56,7 @@ class DiffParser(object):
         return self._success
 
     def parse(self):
-        for line in self._output.readlines():
+        for line in self._output:
             self._action = self._action(line.decode('utf-8', errors='replace'))
         self._action('')
         self._success = True
@@ -226,7 +226,7 @@ def make_feeder_from_raw_reader(in_file, filter=lambda buf: buf):
     def feeder(out_file):
         line_count = 0
         end_nl = False
-        for buf in in_file.readlines():
+        for buf in in_file:
             line_count += 1
             out_file.write(filter(buf))
             max_lines = Config.general.max_diff_input_lines
--
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/reproducible/diffoscope.git