Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ansi2html.sh dies on specific input. #17

Open
John-Schlick opened this issue Oct 8, 2014 · 7 comments
Open

ansi2html.sh dies on specific input. #17

John-Schlick opened this issue Oct 8, 2014 · 7 comments

Comments

@John-Schlick
Copy link

Given the git diff file below, ansi2html (the newest version pulled straight from the repo) will generate output that stops midway thru the file.
Given that the "á" shows up in the output as a "�" I suspect that this is somehow matters.

The HTML stops on this line:
informe. Una agencia investigadora de informes de crédito deber�°

========= Input file =======================
�[1mdiff --git a/classes/screening/report/ui.class.php b/classes/screening/report/ui.class.php�[m
�[1mindex 3476082..7fbb43d 100755�[m
�[1m--- a/classes/screening/report/ui.class.php�[m
�[1m+++ b/classes/screening/report/ui.class.php�[m
�[36m@@ -319,6 +319,14 @@�[m �[mclass screening_Report_UI�[m
}�[m
break;�[m
�[m
�[32m+�[m�[32m // Decide what version of the California disclaimer to use on pur printer friendly forms.�[m
�[32m+�[m�[32m case REPORT_COMPONENT_PRINTER_FRIENDLY_HEADER:�[m
�[32m+�[m�[32m $eligReleaseV2 = Params::get('screening', 'report_component_printer_friendly_header.release.v2');�[m
�[32m+�[m�[32m if ($eligReleaseV2 <= $createTime) {�[m
�[32m+�[m�[32m $renderVersion = 2;�[m
�[32m+�[m�[32m }�[m
�[32m+�[m�[32m break;�[m
�[32m+�[m
case REPORT_COMPONENT_PHYSICAL_CRIMINAL_SEX_OFFENDER:�[m
case REPORT_COMPONENT_TALENTSHIELD_PHYSICAL_CRIMINAL_SEX_OFFENDER:�[m
case REPORT_COMPONENT_PHYSICAL_CRIMINAL_NATIONWIDE_DB:�[m
�[1mdiff --git a/core/inc/defines.inc.php b/core/inc/defines.inc.php�[m
�[1mindex a703c53..5082859 100755�[m
�[1m--- a/core/inc/defines.inc.php�[m
�[1m+++ b/core/inc/defines.inc.php�[m
�[36m@@ -195,8 +195,9 @@�[m �[mdefine ("REPORTYPE_CONSUMER_DMV", "57"); //Consumer Driving Report�[m
// NOTE::::::::::::::: PLEASE DO NOT ADD ANY REPORT TYPE WITHOUT TALKING TO NIRAJ�[m
�[m
/*******************************************************************_**/�[m
�[31m-//Defines the possible ReportComponents�[m
�[31m-//DO NOT CHANGE THE DEFINE NUMBERS ONCE THEY HAVE BEEN USED IN PRODUCTION�[m
�[32m+�[m�[32m// Defines the possible ReportComponents�[m
�[32m+�[m�[32m// DO NOT CHANGE THE DEFINE NUMBERS ONCE THEY HAVE BEEN USED IN PRODUCTION�[m
�[32m+�[m�[32m// Kept in Intelius.Packages.ReportComponents (| separated list, search for LIKE %value%)�[m
define ("REPORT_COMPONENT_NONE", "0");�[m
define ("REPORT_COMPONENT_SUMMARY", "1");�[m
define ("REPORT_COMPONENT_PROPERTY", "7");�[m
�[36m@@ -287,6 +288,7 @@�[m �[mdefine ("REPORT_COMPONENT_PHYSICAL_CRIMINAL_1_STATE", "96"); // Like 110 (natcri�[m
define ("REPORT_COMPONENT_ISERVICES_SEX_OFFENDER", "97");�[m
define ("REPORT_COMPONENT_PHYSICAL_CIVILCOUNTY", "98");�[m
define ("REPORT_COMPONENT_SOCIAL_NET_SUMMARY", "99");�[m
�[32m+�[m�[32mdefine ("REPORT_COMPONENT_PRINTER_FRIENDLY_HEADER", "100");�[m
define ("REPORT_COMPONENT_PHYSICAL_PHYSICAL_EXAM", "101"); // Physical Physical -- I meant to do that�[m
define ("REPORT_COMPONENT_PHYSICAL_ESCREEN_DRUG_SCREEN", "102");�[m
define ("REPORT_COMPONENT_INCART_ACCEPTANCE_MARKETING", "103");�[m
�[1mdiff --git a/core/inc/uberreport.inc.php b/core/inc/uberreport.inc.php�[m
�[1mold mode 100644�[m
�[1mnew mode 100755�[m
�[1mindex 36f768f..202d377�[m
�[1m--- a/core/inc/uberreport.inc.php�[m
�[1m+++ b/core/inc/uberreport.inc.php�[m
�[36m@@ -173,6 +173,9 @@�[m �[mclass UberReport�[m
public function DisplayUberReport($Owner, $GetUserObject, $applicantId, $isPrinterFriendlyPage = 0, $echoOut = TRUE)�[m
{�[m
global $SiteConfigCore;�[m
�[32m+�[m
�[32m+�[m�[32m // guarantee the define of the theme class.�[m
�[32m+�[m�[32m require_once('inc/template.inc.php');�[m
theme::factory()->addDir('screening/tpl'); // In case we don't have it yet�[m
�[m
// ReportContext supersedes isPrinterFriendlyPage�[m
�[36m@@ -971,18 +974,57 @@�[m �[mclass UberReport�[m
$includeFCRA = false;�[m
}�[m
�[m
�[31m- $civilCode = '';�[m
�[31m- if ($includeFCRA){�[m
�[31m- // for internation uberform, we do not show Civil code nor FCRA�[m
�[31m- $civilCode = 'Per California Civil Code 1786, ';�[m
�[32m+�[m�[32m // figure out what render version to use.�[m
�[32m+�[m�[32m $FakeReqProfile = array(�[m
�[32m+�[m�[32m 'App' => REPORT_COMPONENT_PRINTER_FRIENDLY_HEADER,�[m
�[32m+�[m�[32m 'CreateTime' => $this->m_CreateTime,�[m
�[32m+�[m�[32m );�[m
�[32m+�[m�[32m $FakeReportData = array();�[m
�[32m+�[m�[32m $renderVersion = screening_Report_UI::getRenderVersion($FakeReqProfile, $FakeReportData, $this->m_ReportContext);�[m
�[32m+�[m
�[32m+�[m�[32m if ($renderVersion === 1) {�[m
�[32m+�[m�[32m $civilCode = '';�[m
�[32m+�[m�[32m if ($includeFCRA){�[m
�[32m+�[m�[32m // for internation uberform, we do not show Civil code nor FCRA�[m
�[32m+�[m�[32m $civilCode = 'Per California Civil Code 1786, ';�[m
�[32m+�[m�[32m }�[m
�[32m+�[m
�[32m+�[m�[32m // 12pt font is a legal requirement that presumably applies to Web as well as print�[m
�[32m+�[m�[32m $legalTopHtml = '

'.$civilCode.$this->m_SiteName.' does not�[m
�[32m+�[m�[32m guarantee the accuracy or truthfulness of the information in this report as to the�[m
�[32m+�[m�[32m person who is the subject of the investigation, only that the information is accurately copied from�[m
�[32m+�[m�[32m public records. Information generated as a result of identity theft, including evidence of�[m
�[32m+�[m�[32m criminal activity, may be inaccurately associated with the person who is the subject of the report.
'."\r\n";�[m
�[32m+�[m�[32m } else {�[m
�[32m+�[m�[32m // New and Improved, with a fresh spring scent!�[m
�[32m+�[m�[32m // 12pt font is a legal requirement that presumably applies to Web as well as print�[m
�[32m+�[m�[32m $legalTopHtml = '
�[m
�[32m+�[m�[32m California Applicants/Employees Only: The report does not guarantee the�[m
�[32m+�[m�[32m accuracy or truthfulness of the information as to the subject of the�[m
�[32m+�[m�[32m investigation, but only that it is accurately copied from public records,�[m
�[32m+�[m�[32m and information generated as a result of identity theft, including�[m
�[32m+�[m�[32m evidence of criminal activity, may be inaccurately associated with the�[m
�[32m+�[m�[32m consumer who is the subject of the report. An investigative consumer�[m
�[32m+�[m�[32m reporting agency shall provide a consumer seeking to obtain a copy of a�[m
�[32m+�[m�[32m report or making a request to review a file, a written notice in simple,�[m
�[32m+�[m�[32m plain English and Spanish setting forth the terms and conditions of his�[m
�[32m+�[m�[32m or her right to receive all disclosures, as provided in Section�[m
�[32m+�[m�[32m 1786.26.
�[m
�[32m+�[m�[32m
�[m
�[32m+�[m�[32m Sólo para los Solicitantes/Empleados de California: En el informe no se�[m
�[32m+�[m�[32m garantiza la exactitud o veracidad de la información en cuanto al tema�[m
�[32m+�[m�[32m de la investigación, sino sólo que se ha copiado exactamente de los�[m
�[32m+�[m�[32m registros públicos, y la información generada como resultado del robo�[m
�[32m+�[m�[32m de identidad, incluyendo las pruebas de una actividad delictiva, podría�[m
�[32m+�[m�[32m estar incorrectamente asociada con el consumidor que sea el sujeto del�[m
�[32m+�[m�[32m informe. Una agencia investigadora de informes de crédito deberá�[m
�[32m+�[m�[32m suministrarle a un consumidor que trate de obtener una copia de un�[m
�[32m+�[m�[32m informe o solicite revisar un archivo una notificación por escrito en�[m
�[32m+�[m�[32m inglés y español lisos y llanos, en la que se establezcan los términos�[m
�[32m+�[m�[32m y las condiciones de su derecho a recibir toda la información, como se�[m
�[32m+�[m�[32m dispone en la Sección 1786.26.�[m
�[32m+�[m�[32m
'."\r\n";�[m
}�[m
�[31m-�[m
�[31m- // 12pt font is a legal requirement that presumably applies to Web as well as print�[m
�[31m- $legalTopHtml = '
'.$civilCode.$this->m_SiteName.' does not�[m
�[31m- guarantee the accuracy or truthfulness of the information in this report as to the�[m
�[31m- person who is the subject of the investigation, only that the information is accurately copied from�[m
�[31m- public records. Information generated as a result of identity theft, including evidence of�[m
�[31m- criminal activity, may be inaccurately associated with the person who is the subject of the report.
'."\r\n";�[m
}�[m
�[m
$legalBottomHtml = '';�[m
�[1mdiff --git a/tests/unit/DataProvider/UserProvider.php b/tests/unit/DataProvider/UserProvider.php�[m
�[1mnew file mode 100755�[m
�[1mindex 0000000..80cf4d2�[m
�[1m--- /dev/null�[m
�[1m+++ b/tests/unit/DataProvider/UserProvider.php�[m
�[36m@@ -0,0 +1,34 @@�[m
�[32m+�[m�[32m�[m
\ No newline at end of file�[m
�[1mdiff --git a/tests/unit/core/inc/UberReportTest.php b/tests/unit/core/inc/UberReportTest.php�[m
�[1mnew file mode 100755�[m
�[1mindex 0000000..44da4b2�[m
�[1m--- /dev/null�[m
�[1m+++ b/tests/unit/core/inc/UberReportTest.php�[m
�[36m@@ -0,0 +1,49 @@�[m
�[32m+�[m�[32muberReport = $uberReport;�[m �[32m+�[m�[32m }�[m �[32m+�[m �[32m+�[m �[32m+�[m�[32m // This is the weakest unit test ever.�[m �[32m+�[m�[32m // We call it, and make sure the report has the header we coded for.�[m �[32m+�[m�[32m public function testGetUberFormUser()�[m �[32m+�[m�[32m {�[m �[32m+�[m�[32m $userId = 16520012;�[m �[32m+�[m�[32m $userProvider = new UserProvider();�[m �[32m+�[m�[32m $Owner = $userProvider->getUser($userId);�[m �[32m+�[m�[32m $GetUserObject = array($userId);�[m �[32m+�[m�[32m $applicantId = 60924957;�[m �[32m+�[m�[32m // This is critical to our test. It MUSt be a printer friendly page to have the header.�[m �[32m+�[m�[32m $isPrinterFriendlyPage = 1;�[m �[32m+�[m�[32m $echoOut = false;�[m �[32m+�[m �[32m+�[m�[32m $this->uberReport->DisplayUberReport($Owner, $GetUserObject, $applicantId, $isPrinterFriendlyPage, $echoOut);�[m �[32m+�[m �[32m+�[m�[32m // Lets see what damage it's wrought.�[m �[32m+�[m�[32m $className = "UberReport";�[m �[32m+�[m�[32m $propertyName = "m_ReportHtml";�[m �[32m+�[m�[32m $object = $this->uberReport;�[m �[32m+�[m�[32m $m_ReportHtml = $this->getPrivateProperty($className, $propertyName, $object);�[m �[32m+�[m �[32m+�[m�[32m // ALL we changed is that printed reports, done AFTER the params date should have the header section.�[m �[32m+�[m�[32m $this->assertTrue(strstr($m_ReportHtml, '1786.26') !== true, "Report does NOT have the correct 1786.26 header.");�[m �[32m+�[m�[32m }�[m �[32m+�[m�[32m}�[m �[32m+�[m�[32m?>�[m
\ No newline at end of file�[m

@pixelb
Copy link
Owner

pixelb commented Oct 8, 2014

Could you attach the diff in an email to [email protected].

Does the original version of the script before the recent awk change work any better?
https://raw.githubusercontent.com/pixelb/scripts/bd2aabd/scripts/ansi2html.sh

@John-Schlick
Copy link
Author

Good question. The reason I came here to get the newest version is that I was using the older version, and thought I'd try the new one. So...

Nope, it also dies with this character.

I'll attach the diff --color (which is a .txt file) to an email to you, as well as the output html that is cut short.

@pixelb
Copy link
Owner

pixelb commented Oct 9, 2014

Seems locale related. I can get new or old script to misbehave when I give it your UTF8 input, but with the locale variables not set. Though I can't get output truncated like you do. I presume there is an error message on stderr when this truncation happens? Are you running the script with a weird environment? If not what is the output from the command: locale

@John-Schlick
Copy link
Author

I'm just running it on qa linux box, I don't >>think<< that anything is weird about it.

jschlick@dvm-jschlick2:/usr/local/html(BGS-1516)$ locale
LANG=en_US
LANGUAGE=
LC_CTYPE="en_US"
LC_NUMERIC="en_US"
LC_TIME="en_US"
LC_COLLATE="en_US"
LC_MONETARY="en_US"
LC_MESSAGES="en_US"
LC_PAPER="en_US"
LC_NAME="en_US"
LC_ADDRESS="en_US"
LC_TELEPHONE="en_US"
LC_MEASUREMENT="en_US"
LC_IDENTIFICATION="en_US"
LC_ALL=

(I've never used this command before, so I just blindly typed it, and this is the output.)

@pixelb
Copy link
Owner

pixelb commented Oct 9, 2014

You're in an is0-8859-1 locale. It's unusual to not be in a UTF8 locale these days.
A probable workaround would be to set a UTF8 locale first like:

git diff | (export LC_ALL=en_US.utf8; ansi2html.sh) > blah.html

I'll work on making it more locale agnostic

@John-Schlick
Copy link
Author

Your workaround works for this case.
thanks for figuring it out, I'd have probably never gotten there (since I don't actually know what that export does...)

If you do make this more locale agnostic, please let me know, and I'll happily take the new version and apply it to the entire company here.

@pixelb
Copy link
Owner

pixelb commented Jan 26, 2015

The latest version is now more locale agnostic. Could you try it out? I've not marked this bug as fixed though as I wasn't able to recreate your output truncation issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants