
Talend BigData Installation on Linux

By Sorabh Jain - August 27, 2021

This blog is a step-by-step tutorial on how to install and configure the core components of the Talend BigData suite in a Linux environment.

The version of Talend used to write this blog is 7.2.1.

The instructions outlined in the blog are relevant to:

  • Companies with at least a TAC (Talend Administration Center) license
  • Users who wish to follow security practices to set up Talend BigData
  • Users who require SSL encryption for Talend components
  • Users who wish to follow the Red Hat Package Manager (RPM) method for a seamless installation of Talend modules
  • Users who wish to configure TAC with a MariaDB database

and does not include:

  • Automated installation scripts/configuration for provisioning systems such as Puppet, Chef, etc.
  • Enabling high availability on Talend components

The following components are addressed in this blog:

  • GitServer
  • TAC
  • Nexus
  • JobServer
  • Configuring TAC and Talend JobServer to use SSL

Download Talend Files

Download all the necessary files from the Talend website to a directory on the desired server (the TAC server is preferred). Talend provides the list of component download links by email, along with a license file. The credentials (username and password) are included in the same license email.

mkdir /opt/talend
cd /opt/talend
cat >talenddownloadurls.txt <<EOF
http://www.opensourceetl.net/tis/tdf_721/Talend-Studio-20180411_1414-V7.0.1.zip
http://www.opensourceetl.net/tis/tdf_721/Talend-Studio-20180411_1414-V7.0.1.zip.MD5
http://www.opensourceetl.net/tis/tdf_721/Talend-AdministrationCenter-20180411_1414-V7.0.1.zip
http://www.opensourceetl.net/tis/tdf_721/Talend-AdministrationCenter-20180411_1414-V7.0.1.zip.MD5
http://www.opensourceetl.net/tis/tdf_721/Talend-JobServer-20180411_1414-V7.0.1.zip
http://www.opensourceetl.net/tis/tdf_721/Talend-JobServer-20180411_1414-V7.0.1.zip.MD5
EOF
wget --input-file=talenddownloadurls.txt --user=.... --password=....
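
Since each archive is downloaded together with a .MD5 file, the downloads can be checked before installing anything. The following is a minimal sketch, not part of the Talend tooling: verify_md5 is a hypothetical helper name, and it assumes that each .MD5 file holds the hex digest as its first whitespace-separated field.

```shell
# verify_md5 FILE
# Compares the MD5 digest of FILE with the digest stored in FILE.MD5.
# Assumption: the first field of FILE.MD5 is the hex digest (case-insensitive).
verify_md5() {
    file=$1
    expected=$(awk 'NR == 1 { print tolower($1) }' "$file.MD5") || return 1
    actual=$(md5sum "$file" | awk '{ print $1 }') || return 1
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Example usage (run inside /opt/talend after the wget step):
# for f in Talend-*.zip; do
#     if verify_md5 "$f"; then echo "OK       $f"; else echo "MISMATCH $f"; fi
# done
```

If a file is reported as a mismatch, downloading it again is usually safer than installing from it.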

Installing Talend BigData platform using RPM

Talend provides RPM packages that allow you to deploy applications and services easily. You can deploy and install RPM packages individually as detailed in this blog.

1. Install Version Control System (Gitblit)

Talend needs a git or svn repository for developers to store data in, and for the Talend build-tools to compile code from.

a. Create a directory for Gitblit. For example:

$ mkdir -p /opt/talend/gitblit ; cd /opt/talend/gitblit

b. Download and extract the gitblit package

$ sudo wget http://dl.bintray.com/gitblit/releases/gitblit-1.8.0.tar.gz

Extract the downloaded tarball package to /opt/talend/gitblit/

$ sudo tar -zxvf gitblit-1.8.0.tar.gz

c. Add and use gitblit as a service using the following steps:

[/opt/talend/gitblit]# cp service-centos.sh /etc/init.d/gitblit
[/opt/talend/gitblit]# chkconfig --add gitblit
[/opt/talend/gitblit]# service gitblit start
Basefolder : /opt/talend/gitblit/gitblit-1.8.0/data
Settings : /opt/talend/gitblit/gitblit-1.8.0/data/gitblit.properties

Open any browser and go to http://gitserver:8080/. Log in with the default admin credentials (username admin, password admin) to verify that Gitblit is running.

2. Installing and configuring TAC

TAC is a web-based administration application that gives access to all management and administration functionalities for an integration project.

The JAVA_HOME variable should be set to the correct Java home directory: /usr/lib/jvm/jre-1.8.0-openjdk
1. Create a file talend.repo in the /etc/yum.repos.d directory containing the following configuration:

[talend-7.2.1]
name=Talend 7.2.1
baseurl='https://<user>:<password>@www.opensourceetl.net/rpms/talend/7.2.1/base/x86_64/'
enabled=1
gpgcheck=1
gpgkey=http://www.opensourceetl.net/rpms/GPG-KEY-talend
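
Because the same repository definition is needed again on every JobServer host, it can be convenient to generate the file from a script. This is only a sketch: write_talend_repo is a hypothetical helper, the credentials are passed as arguments, and it assumes they contain no characters that would need URL-encoding. The restrictive file mode is an extra precaution added here, not something the Talend instructions require.

```shell
# write_talend_repo DEST USER PASSWORD
# Writes the Talend 7.2.1 yum repository definition shown above to DEST.
write_talend_repo() {
    dest=$1
    user=$2
    password=$3
    cat > "$dest" <<EOF
[talend-7.2.1]
name=Talend 7.2.1
baseurl='https://${user}:${password}@www.opensourceetl.net/rpms/talend/7.2.1/base/x86_64/'
enabled=1
gpgcheck=1
gpgkey=http://www.opensourceetl.net/rpms/GPG-KEY-talend
EOF
    # The file contains credentials, so restrict it to its owner.
    chmod 600 "$dest"
}

# Example usage (as root):
# write_talend_repo /etc/yum.repos.d/talend.repo "<user>" "<password>"
```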

2. Install Tomcat with RPM
We are installing Tomcat using the RPM provided by Talend

sudo yum install talend-tomcat
Tomcat Configuration files : /opt/talend/tomcat/conf
Tomcat Logs location : /opt/talend/tomcat/logs

3. Install Talend Administration Center


sudo yum install talend-tac

Start, stop and check the status of the TAC service using systemd as shown below:

  • Start the service using the following command:
    sudo systemctl start talend-tac
  • Stop the service using the following command:
    sudo systemctl stop talend-tac
  • Check the status of the service using the following command:
    sudo systemctl status talend-tac
    TAC Configuration files : /etc/talend/tac
    TAC Log files : /opt/talend/tac/archive/logs

Install the MySQL driver (mysql-connector-java.jar) on the TAC server in the following path:
/opt/talend/tac/tomcat/webapps/org.talend.administrator/WEB-INF/lib

4. Configure TAC to use a MariaDB database:

4.1) Open the browser and visit the following URL:

http://tacserver:8080/org.talend.administrator/

4.2) Enter the default admin password. The MariaDB database connection parameters will be displayed, and automated checks are performed on the JDBC driver, URL, connection, and version information.

Url : jdbc:mariadb://talenddbhost:3306/talend_admin
User : talend_admin
Driver : org.mariadb.jdbc.Driver

4.3) Click ‘Set new license’. Browse to the license file received by email from Talend and upload it.

4.4) Click on Login

4.5) Visit the login page, enter the default login credentials for the first access (login: security@company.com, password: admin)
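
The connection parameters in step 4.2 presuppose that the talend_admin database and user already exist on talenddbhost. A minimal sketch of how they might be created is shown below. It assumes a MariaDB version that supports CREATE USER IF NOT EXISTS, a password without single quotes, and a database administrator account named root. tac_db_sql is a hypothetical helper that only prints the statements, so they can be reviewed before being run.

```shell
# tac_db_sql DB USER PASSWORD
# Prints the SQL needed to create the TAC database and its user.
# The '%' host wildcard allows connections from any host; narrow it
# to the TAC server's hostname where possible.
tac_db_sql() {
    db=$1
    user=$2
    password=$3
    printf "CREATE DATABASE IF NOT EXISTS %s;\n" "$db"
    printf "CREATE USER IF NOT EXISTS '%s'@'%%' IDENTIFIED BY '%s';\n" "$user" "$password"
    printf "GRANT ALL PRIVILEGES ON %s.* TO '%s'@'%%';\n" "$db" "$user"
}

# Example usage (prompts for the MariaDB root password):
# tac_db_sql talend_admin talend_admin 'choose-a-password' | mysql -h talenddbhost -u root -p
```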

3. Installing and configuring Nexus

A Maven artifact repository is needed by various Talend components to store software updates and Data Integration Job artifacts.

#mkdir /opt/talend/
cd /opt/talend
# download the nexus tarball
wget http://download.sonatype.com/nexus/oss/nexus-2.14.14-01-bundle.tar.gz --no-check-certificate
tar xvzf nexus-2.14.14-01-bundle.tar.gz
# create a user that will be used to run Nexus
adduser nexus
chown -R nexus:nexus /opt/talend/nexus-2.14.14-01/
# link the startup script by its absolute path so the symlink resolves from /etc/init.d
ln -s /opt/talend/nexus-2.14.14-01/bin/nexus /etc/init.d/nexus
sudo systemctl enable nexus
/sbin/chkconfig nexus on
chown -R nexus:nexus /opt/talend/sonatype-work
# start the Nexus service
systemctl start nexus

Open the browser and enter http://nexus_host:8081/nexus to explore Nexus.
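
Nexus can take a while to come up after the service starts, so a scripted installation may want to wait for the web interface before continuing. The sketch below polls a URL with curl until it answers without an HTTP error. wait_for_http and its default retry values are assumptions made for illustration, and the same helper would work for the Gitblit and TAC URLs used earlier.

```shell
# wait_for_http URL [ATTEMPTS] [DELAY_SECONDS]
# Returns 0 as soon as URL answers without an HTTP error (status below 400),
# or 1 if it still fails after ATTEMPTS tries.
wait_for_http() {
    url=$1
    attempts=${2:-30}
    delay=${3:-5}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if curl --silent --fail --output /dev/null "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example usage:
# wait_for_http http://nexus_host:8081/nexus && echo "Nexus is up"
```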

4. Installing and configuring Talend JobServer

The JobServer is a simple agent to which a client (usually TAC or Talend Studio) can send Java JAR files. The JobServer then executes each JAR file in a separate JVM process and reports various statistics back to its client while the Job is running.

The installation process should be repeated on each host on which you wish to run Talend Jobs, which might or might not include the server on which the TAC(s) run.

1. Download and install the public signing key using the following command:
rpm --import http://www.opensourceetl.net/rpms/GPG-KEY-talend

2. Add the following lines to the root user's .bashrc:

export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk
export PATH=$JAVA_HOME/bin:$PATH

3. Create a file talend.repo in the /etc/yum.repos.d directory, containing the following configuration:

[talend-7.2.1]
name=Talend 7.2.1
baseurl='https://<user>:<password>@www.opensourceetl.net/rpms/talend/7.2.1/base/x86_64/'
enabled=1
gpgcheck=1
gpgkey=http://www.opensourceetl.net/rpms/GPG-KEY-talend

4. Install Talend JobServer:

sudo yum install talend-jobserver

Start, stop and check the status of the Talend JobServer using systemd.

  • Start the service using the following command:
    sudo systemctl start talend-jobserver
  • Stop the service using the following command:
    sudo systemctl stop talend-jobserver
  • Check the status of the service using the following command:
    sudo systemctl status talend-jobserver

Configuration files : /opt/talend/jobserver/conf
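
Since the JobServer launches every Job in its own JVM, a wrong JAVA_HOME is a common reason for the service failing to start. The following is a small sanity check, given as a sketch only. check_java_home is a hypothetical helper, and it only confirms that the directory exported in step 2 contains an executable java binary.

```shell
# check_java_home
# Fails if JAVA_HOME is unset or does not contain an executable bin/java;
# otherwise prints the first line of the Java version banner.
check_java_home() {
    if [ -z "${JAVA_HOME:-}" ]; then
        echo "JAVA_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$JAVA_HOME/bin/java" ]; then
        echo "No executable java found under $JAVA_HOME/bin" >&2
        return 1
    fi
    # java prints its version banner on stderr, so merge it into stdout.
    "$JAVA_HOME/bin/java" -version 2>&1 | head -n 1
}

# Example usage (as root, after sourcing .bashrc):
# check_java_home && sudo systemctl start talend-jobserver
```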

5. Configure TAC and Talend JobServer to use SSL

This section describes how to configure SSL transport and authentication for Talend JobServer and TAC.

1. Enable SSL on TAC:

  • Shutdown Tomcat
  • Navigate to the Tomcat conf sub-folder
  • Edit the server.xml file
  • Locate and uncomment the SSL-enabled HTTP connector (it is commented out by default)
  • Modify the connector config as follows:
<Connector port="8080" protocol="HTTP/1.1" SSLEnabled="true"
maxThreads="150" scheme="https" secure="true"
clientAuth="false" sslProtocol="TLS"
keystoreFile="full path to keystore file from above"
keystorePass="talend"/>
  • Remove/disable the tcnative-1.dll DLL from the Tomcat bin folder (if you are unsure, move it to an archive directory or rename it, e.g. to tcnative-1.dll.DISABLED)
  • Restart Tomcat, and check that the https protocol is supported by navigating to the base Tomcat landing page over HTTPS
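
The connector refers to a keystore file, but this post does not show how that file is created. As a sketch, a self-signed keystore suitable for a test environment could be generated with the JDK's keytool. The helper name make_tac_keystore, the tomcat alias, the one-year validity and the example path are all assumptions. A certificate issued by your organisation's CA would normally replace the self-signed one in production.

```shell
# make_tac_keystore KEYSTORE_PATH STORE_PASSWORD COMMON_NAME
# Generates a keystore holding one self-signed RSA key pair.
# STORE_PASSWORD must match keystorePass in the Tomcat connector.
make_tac_keystore() {
    keystore=$1
    storepass=$2
    cn=$3
    keytool -genkeypair \
        -alias tomcat \
        -keyalg RSA -keysize 2048 \
        -validity 365 \
        -dname "CN=$cn" \
        -keystore "$keystore" \
        -storepass "$storepass" \
        -keypass "$storepass"
}

# Example usage (path is illustrative; the password matches the connector above):
# make_tac_keystore /opt/talend/tomcat/conf/tac-keystore.jks talend tacserver
```

The common name should match the hostname that browsers and JobServers use to reach TAC, otherwise clients will report a hostname mismatch.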

2. Enabling SSL for Talend JobServer Command Port and File Port:

  • Make the following changes in jobserver/agent/conf/TalendJobServer.properties:
org.talend.remote.server.ssl.keyStore=/opt/cloudera/security/jks/localhost-keystore.jks
org.talend.remote.server.ssl.keyStorePassword=*******
org.talend.remote.server.ssl.trustStore=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.222.b10-1.el7_7.x86_64/jre/lib/security/jssecacerts
org.talend.remote.server.ssl.trustStorePassword=*******
org.talend.remote.server.ssl.authenticate=true
org.talend.remote.jobserver.server.TalendJobServer.USE_SSL=true
  • Restart the JobServer:
    systemctl restart talend-jobserver
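
When several JobServers need the same SSL settings, editing each properties file by hand becomes error-prone. The sketch below updates a key in place if it is already present and appends it otherwise. set_prop is a hypothetical helper, and it assumes simple key=value lines with no spaces around the equals sign and no backslashes in the value.

```shell
# set_prop FILE KEY VALUE
# Replaces the line starting with "KEY=" in FILE, or appends KEY=VALUE
# if no such line exists. The key is matched literally, not as a regex.
set_prop() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    if awk -v k="$key" -v v="$value" '
        index($0, k "=") == 1 { print k "=" v; found = 1; next }
        { print }
        END { if (!found) print k "=" v }
    ' "$file" > "$tmp"; then
        # Write back through the original file to keep its owner and mode.
        cat "$tmp" > "$file"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Example usage:
# props=jobserver/agent/conf/TalendJobServer.properties
# set_prop "$props" org.talend.remote.server.ssl.authenticate true
# set_prop "$props" org.talend.remote.jobserver.server.TalendJobServer.USE_SSL true
```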

For the TAC, we can use tac/apache-tomcat/bin/setenv.sh to extend the JAVA_OPTS specification:

export JAVA_OPTS="$JAVA_OPTS -Xmx2048m -Dfile.encoding=UTF-8 -Dorg.talend.remote.client.ssl.keyStore=…"

Author
Sorabh Jain

BigData Administrator at Clairvoyant LLC

Tags: Big Data Data Engineering Talend Talend Open Studio Etl Data Warehouse