Yes, I've noticed this. My (ugly hack of a) solution is to run
the following shell script as a cron job every night:
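#!/bin/bash
# Strip control characters from each XML file, keeping a backup copy in bak/.
for xml in *-2007.xml
do
    [ -e "$xml" ] || continue              # the glob matched nothing
    cp -- "$xml" "bak/$xml" || continue    # never truncate a file we could not back up
    # Keep tab, LF, VT, FF, CR (octal 011-015) and printable characters; delete the rest.
    tr -cd '\011-\015[:print:]' < "bak/$xml" > "$xml"
done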
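Any non-printable character other than tab, newline, vertical tab, form
feed and carriage return (octal 011-015) gets deleted. Note that GNU tr
works on bytes, so in the C locale that cron jobs typically run in,
non-ASCII bytes are stripped as well.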
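For what it's worth, here is a quick way to see what the filter does before letting it loose on real files. This is only a sketch: strip_ctrl is a made-up helper name, and forcing LC_ALL=C is my assumption so that the result does not depend on the caller's locale.

```shell
#!/bin/bash
# strip_ctrl is a hypothetical helper: it applies the same tr filter to stdin.
# LC_ALL=C is an assumption, added so [:print:] means printable ASCII only.
strip_ctrl() {
    LC_ALL=C tr -cd '\011-\015[:print:]'
}

# The sample contains SOH (\001) and ESC (\033), plus a tab and a newline.
# Piping through od -c makes any surviving control characters visible.
printf 'a\001b\tc\033d\n' | strip_ctrl | od -c
```

The od -c dump is just there so you can check by eye which bytes made it through.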