[Git][reproducible-builds/diffoscope][master] Detect XML files as XML files if file(1) claims if they are XML files or if...

Chris Lamb (@lamby) gitlab at salsa.debian.org
Thu Nov 11 16:38:54 UTC 2021



Chris Lamb pushed to branch master at Reproducible Builds / diffoscope


Commits:
615d177d by Chris Lamb at 2021-11-11T08:35:41-08:00
Detect XML files as XML files if file(1) claims if they are XML files or if they are named .xml. (Closes: reproducible-builds/diffoscope#287, Debian:#999438)

- - - - -


1 changed file:

- diffoscope/comparators/xml.py


Changes:

=====================================
diffoscope/comparators/xml.py
=====================================
@@ -2,7 +2,7 @@
 # diffoscope: in-depth comparison of files, archives, and directories
 #
 # Copyright © 2017 Juliana Rodrigues <juliana.orod at gmail.com>
-# Copyright © 2017-2020 Chris Lamb <lamby at debian.org>
+# Copyright © 2017-2021 Chris Lamb <lamby at debian.org>
 #
 # diffoscope is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@@ -17,6 +17,8 @@
 # You should have received a copy of the GNU General Public License
 # along with diffoscope.  If not, see <https://www.gnu.org/licenses/>.
 
+import re
+
 from xml.parsers.expat import ExpatError
 
 from diffoscope.comparators.utils.file import File
@@ -83,7 +85,7 @@ class XMLFile(File):
     """
 
     DESCRIPTION = "XML files"
-    FILE_EXTENSION_SUFFIX = {".xml"}
+    FILE_TYPE_RE = re.compile(r"^XML \S+ document")
 
     @classmethod
     def recognizes(cls, file):
@@ -96,7 +98,8 @@ class XMLFile(File):
         Returns:
             False if file is not a XML File, True otherwise
         """
-        if not super().recognizes(file):
+
+        if not super().recognizes(file) and not file.name.endswith(".xml"):
             return False
 
         with open(file.path) as f:
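
For readers skimming the hunk above, the new pre-check can be sketched in isolation. This is a minimal illustration, not diffoscope's actual code: the function name `is_xml_candidate` and its two string parameters are hypothetical, and it assumes that the base class's `recognizes()` amounts to matching `FILE_TYPE_RE` against the description string reported by file(1). It covers only the early-return condition changed in this commit, not the parsing step that follows it.

```python
import re

# Same pattern as the FILE_TYPE_RE added in the commit.
FILE_TYPE_RE = re.compile(r"^XML \S+ document")


def is_xml_candidate(magic_description, name):
    """Return True if the file should go on to the XML parsing step.

    Hypothetical helper: `magic_description` stands in for the string
    file(1) reports, and `name` for the file's name.
    """
    # Assumption: the superclass check is a regex match on file(1)'s output.
    matches_magic = bool(FILE_TYPE_RE.search(magic_description))
    # New in this commit: a ".xml" suffix is accepted as an alternative.
    return matches_magic or name.endswith(".xml")
```

Under these assumptions, a file is rejected early only when both checks fail, so a file named `.xml` that file(1) describes as plain text still reaches the parser.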



View it on GitLab: https://salsa.debian.org/reproducible-builds/diffoscope/-/commit/615d177ddbedca3787437b25d8446e335b8d26ad

-- 
You're receiving this email because of your account on salsa.debian.org.

