Working with Non-ASCII JDBC Data in Talend

When testing Talend with the Easysoft JDBC-ODBC Bridge, we saw text corruption when writing non-ASCII data from a SQL Server database to a CSV file.

The workaround was to change the data type of the problem column in the Talend schema from String to byte[]. To do this, we:

  1. Accessed the tJDBCInput component's properties.
  2. Chose the Edit schema button.
  3. Changed the data type of the relevant column from String to byte[].
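
As an illustration of why raw bytes are safer than an already-decoded String, here is a minimal Java sketch. It is not Talend-generated code: the class name ByteColumnDecoding, the decode helper, and the choice of ISO-8859-1 as the source encoding are assumptions made for the example. It shows that the same bytes decode correctly with the right charset and lossily when they are treated as UTF-8.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ByteColumnDecoding {
    // Hypothetical helper: decode raw column bytes with an explicitly chosen charset.
    public static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // "cafe" with an accented final e, as ISO-8859-1 bytes (assumed source
        // encoding): the accented character is the single byte 0xE9.
        byte[] raw = {0x63, 0x61, 0x66, (byte) 0xE9};

        // Decoding with the charset the bytes were written in preserves the text.
        String correct = decode(raw, StandardCharsets.ISO_8859_1);

        // Decoding the same bytes as UTF-8 fails on 0xE9, which is not a valid
        // UTF-8 sequence here, so Java substitutes the replacement character U+FFFD.
        String wrong = decode(raw, StandardCharsets.UTF_8);

        System.out.println(correct.equals("caf\u00e9"));
        System.out.println(wrong.contains("\uFFFD"));
    }
}
```

Keeping the column as byte[] defers this decoding step, so the bytes are not interpreted with the wrong encoding along the way.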

Talend's default character encoding is UTF-8. As long as the data returned by a JDBC driver contains only ASCII characters, it does not matter if that data was encoded with a different ASCII-compatible encoding (such as ISO-8859-1 or Windows-1252): UTF-8 data that contains only ASCII characters is byte-for-byte identical to ASCII data.
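
This can be checked with a short Java sketch. The class name AsciiEncodingCheck and the sameBytes helper are illustrative, and ISO-8859-1 stands in for "a different character encoding". ASCII-only text produces the same bytes in both encodings, whereas text containing a non-ASCII character does not.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class AsciiEncodingCheck {
    // Returns true if the text encodes to the same bytes in UTF-8 and ISO-8859-1.
    public static boolean sameBytes(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        return Arrays.equals(utf8, latin1);
    }

    public static void main(String[] args) {
        // ASCII-only text: one identical byte per character in both encodings.
        System.out.println(sameBytes("Talend"));

        // Text with an accented e: one byte (0xE9) in ISO-8859-1,
        // but two bytes (0xC3 0xA9) in UTF-8, so the encodings differ.
        System.out.println(sameBytes("caf\u00e9"));
    }
}
```

The first call prints true and the second prints false, which is why the corruption only appears once non-ASCII characters are present.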