Easysoft Blog

Thu, 29 June 2016

Using ODBC Data in RapidMiner

The RapidMiner distribution includes a generic connector that enables data sets to be constructed from any database for which an ODBC driver is available. The connector is called the JDBC-ODBC Bridge and is an integral part of Java 1.7 and earlier. The implications of that are:

If you need to connect RapidMiner to a 32-bit ODBC driver, use the Easysoft JDBC-ODBC Bridge instead. The Easysoft JDBC-ODBC Bridge has a Java component and a 32-bit native component. It is fully supported by Easysoft and is currently being deployed in production environments throughout the world.

You need to use the Professional version of RapidMiner to use third party drivers such as the Easysoft JDBC-ODBC Bridge with it. A trial version of RapidMiner Professional is available to users who create an account on the RapidMiner web site.

If you do not already have RapidMiner Professional, follow these steps:

  1. Create a RapidMiner account and then verify the email address you supplied when creating the account.
  2. Download RapidMiner Studio for your Windows platform.
  3. Start RapidMiner and then log in when prompted.

    Your RapidMiner Professional trial period has now started.

To connect RapidMiner to some ODBC data:

  1. Configure a System ODBC data source for the database that you want to connect to in RapidMiner.

    To do this, use the 32-bit version of ODBC Data Source Administrator on your RapidMiner machine. On some versions of Windows, this is located in Control Panel > Administrative Tools. On some version of Windows, you need to search for ODBC in the taskbar search box. The 32-bit version of ODBC Data Source Administrator should be clearly labelled. If in doubt, in the Windows Run dialog box, type:

    %windir%\syswow64\odbcad32.exe

    We wanted to try RapidMiner with some Microsoft Access data, and so we created an Access ODBC driver data source that pointed to the Northwind database.

  2. Download the Easysoft JDBC-ODBC Bridge. (Registration required.)
  3. Install and license the Easysoft JDBC-ODBC Bridge on the machine where the RapidMiner is installed.

    For installation instructions, see the Easysoft JDBC-ODBC Bridge documentation.

  4. In RapidMiner, choose Connections > Manage Database Drivers.

    The Manage Database Drivers dialog box is displayed.

  5. Choose Add.
  6. Complete the dialog box fields:
    FieldValue
    NameEasysoft JDBC-ODBC Bridge
    URL prefixjdbc:easysoft://
    Port(Leave blank)
    Schema separator(Leave blank)
    Jar fileeasysoft_install_dir\Jars\EJOB.jar

    On 64-bit Windows, the default location for <easysoft_install_dir> is <drive>:\Program Files (x86)\Easysoft Limited\Easysoft JDBC-ODBC Bridge.

    On 32-bit Windows, the default location for <easysoft_install_dir> is <drive>:\Program Files\Easysoft Limited\Easysoft JDBC-ODBC Bridge.

    JDBC Classeasysoft.sql.jobDriver
  7. Choose Save.
  8. Choose Connections > Manage Database Connections.

    The Manage Database Connections dialog box is displayed.

  9. Choose New.
  10. Complete the dialog box fields:
    FieldValue
    NameJDBC-ODBC Bridge
    Database systemEasysoft JDBC-ODBC Bridge
    Hostlocalhost/
    Port(Leave blank)
    Dabase schememy_odbc_data_source
  11. Choose Advanced

    The Advanced Connection Properties dialog box is displayed.

  12. Complete the dialog box fields:
    FieldValue
    logonusermy_windows_user
    logonpasswordmy_windows_password

    Select Override for both these fields.

  13. Use the Test button to verify that you have correctly entered the connection details.

To analyse the ODBC data made accessible via the JDBC-ODBC Bridge, create a new RapidMiner process:

  1. Choose File > New Process.
  2. Choose a blank process when prompted.
  3. Use the Repository pane to locate the ODBC data. We used the JDBC-ODBC Bridge to expose some Access data in the Northwind database: DB > JDBC-ODBC Bridge > Example Sets > Customers.
  4. In Design view, drag a table from the Repository pane to the Process pane. In our case, the table was Customers.
  5. Drag a Blending > Values > Set Data operator from the Operators pane to the Process pane.
  6. Drag the output of the retrieve operator to the Set Data operator.
  7. Use the Set Data parameters tab to alter a column value. For example, we altered a job title in our customer data from Sales Representative to Sales Manager:
    FieldValue
    Example index1
    Attribute nameCustomerID
    ValueALFKI
    Additional valuesEdit List (0) ... > attribute name ContactTitle

    Edit List (0) ... > value Sales Manager

  8. Drag the output of the Set Data operator to the res button and the choose Play.

    The modified data is displayed in the Results view.