{"id":118,"date":"2009-08-27T08:10:34","date_gmt":"2009-08-27T07:10:34","guid":{"rendered":"http:\/\/lendl.priv.at\/blog\/2009\/08\/27\/cryptopensslx509-and-utf-8-characters\/"},"modified":"2026-01-26T12:16:12","modified_gmt":"2026-01-26T11:16:12","slug":"cryptopensslx509-and-utf-8-characters","status":"publish","type":"post","link":"https:\/\/lendl.priv.at\/blog\/2009\/08\/27\/cryptopensslx509-and-utf-8-characters\/","title":{"rendered":"Crypt::OpenSSL::X509 and UTF-8 strings"},"content":{"rendered":"<p><em>Bumped to top due to updates.<\/em><\/p>\n<p>For my current project I look at a lot of <a href=\"http:\/\/en.wikipedia.org\/wiki\/X509\">X.509 certificates<\/a> using Dan Sully&#8217;s <a href=\"http:\/\/search.cpan.org\/dist\/Crypt-OpenSSL-X509\/\">Crypt::OpenSSL::X509<\/a> Perl module. I&#8217;m not using the version from CPAN, but his current codebase straight from his <a href=\"http:\/\/github.com\/dsully\/perl-crypt-openssl-x509\/tree\/master\">git repository<\/a>.<\/p>\n<p>While trying to store information about certs in a PostgreSQL DB which is set to UTF-8, I encountered errors. After some debugging I found that some of the certs had umlauts in the subject field. 
The XS code from Crypt::OpenSSL::X509 wasn&#8217;t UTF-8 aware, causing automatic down-conversion to ISO-8859-1, which produced illegal byte sequences when parsed as UTF-8.<\/p>\n<p>After some cursing and debugging I came up with this <a id=\"p119\" href=\"x509xs.diff\" title=\"X509.xs.diff\">patch<\/a>:<\/p>\n<p><code><br \/>\n--- ..\/dsully-perl-crypt-openssl-x509\/X509.xs  2009-03-06 22:22:44.000000000 +0100<br \/>\n+++ X509.xs     2009-08-17 14:46:00.000000000 +0200<br \/>\n@@ -73,6 +73,15 @@<br \/>\n        return sv;<br \/>\n }<\/p>\n<p>+static SV* sv_bio_utf8_on(BIO *bio) {<br \/>\n+<br \/>\n+       SV* sv;<br \/>\n+       sv = (SV *)BIO_get_callback_arg(bio);<br \/>\n+       SvUTF8_on(sv);<br \/>\n+       return sv;<br \/>\n+}<br \/>\n+<br \/>\n+<br \/>\n \/*<br \/>\n static void sv_bio_error(BIO *bio) {<\/p>\n<p>@@ -293,8 +302,10 @@<br \/>\n                        name = X509_get_issuer_name(x509);<br \/>\n                }<\/p>\n<p>+               \/* this need not be pure ascii, try to get a native perl character string with utf8 *\/<br \/>\n+               sv_bio_utf8_on(bio);<br \/>\n                \/* this is prefered over X509_NAME_oneline() *\/<br \/>\n-               X509_NAME_print_ex(bio, name, 0, XN_FLAG_SEP_CPLUS_SPC);<br \/>\n+               X509_NAME_print_ex(bio, name, 0, (XN_FLAG_SEP_CPLUS_SPC | ASN1_STRFLGS_UTF8_CONVERT) & ~ASN1_STRFLGS_ESC_MSB);<\/p>\n<p>        } else if (ix == 3) {<\/p>\n<p>@@ -799,7 +810,8 @@<br \/>\n             n = OBJ_nid2sn(nid);<br \/>\n         }<br \/>\n         BIO_printf(bio, \"%s=\", n);<br \/>\n-        ASN1_STRING_print(bio, X509_NAME_ENTRY_get_data(name_entry));<br \/>\n+       sv_bio_utf8_on(bio);<br \/>\n+        ASN1_STRING_print_ex(bio, X509_NAME_ENTRY_get_data(name_entry),ASN1_STRFLGS_UTF8_CONVERT & ~ASN1_STRFLGS_ESC_MSB);<br \/>\n        RETVAL = sv_bio_final(bio);<\/p>\n<p>     OUTPUT:<br \/>\n<\/code><\/p>\n<p>Basically, this just tells the OpenSSL library to output UTF-8 and the Perl core that the 
new strings are encoded in UTF-8.<\/p>\n<p>This might be overkill in the cases where it&#8217;s not actually needed, but it should do no harm.<\/p>\n<p><strong>Update:<\/strong> My patch is now in the <a href=\"http:\/\/github.com\/dsully\/perl-crypt-openssl-x509\/commit\/91126fce9efa42a50dc9f824f1e2728e440d4119\">git repository<\/a>.<\/p>\n<p><strong>Update2:<\/strong> Life is not that easy. Looking at more X.509 certs in the wild shows that OpenSSL does not check whether it returns a valid UTF-8 string. So stay tuned for additional patches in Dan&#8217;s git repository.<\/p>\n<p><strong>Update3:<\/strong> My patches are now integrated in the <a href=\"http:\/\/github.com\/dsully\/perl-crypt-openssl-x509\/tree\/master\">git version<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Bumped to top due to updates. For my current project I look at a lot of X.509 certificates using Dan Sully&#8217;s Crypt::OpenSSL::X509 Perl module. I&#8217;m not using the version from CPAN, but his current codebase straight from his git repository. 
While trying to store information about certs in a PostgreSQL DB which is set to [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4,11],"tags":[],"class_list":["post-118","post","type-post","status-publish","format-standard","hentry","category-cert","category-system-administration"],"_links":{"self":[{"href":"https:\/\/lendl.priv.at\/blog\/wp-json\/wp\/v2\/posts\/118","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/lendl.priv.at\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/lendl.priv.at\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/lendl.priv.at\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/lendl.priv.at\/blog\/wp-json\/wp\/v2\/comments?post=118"}],"version-history":[{"count":1,"href":"https:\/\/lendl.priv.at\/blog\/wp-json\/wp\/v2\/posts\/118\/revisions"}],"predecessor-version":[{"id":981,"href":"https:\/\/lendl.priv.at\/blog\/wp-json\/wp\/v2\/posts\/118\/revisions\/981"}],"wp:attachment":[{"href":"https:\/\/lendl.priv.at\/blog\/wp-json\/wp\/v2\/media?parent=118"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/lendl.priv.at\/blog\/wp-json\/wp\/v2\/categories?post=118"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/lendl.priv.at\/blog\/wp-json\/wp\/v2\/tags?post=118"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}