Integrate job with Windows Explorer

In this tutorial we are going to learn how to integrate a Talend job with Windows Explorer so that we can run the job on a file from the context menu.


1. Introduction

Before I start this tutorial I want to thank Ian Mayo for sponsoring this article in support of PlanetMayo Open Source projects.

In this document we are going to integrate the job that we created in the last tutorial (Export job that can be run from command line) with Windows Explorer and make it usable with a simple right click on one or more files.

To do that, we are going to follow these steps :
  • Creating a batch file that executes the Talend job.
  • Adding that batch file to the context menu.

2. Batch file :

The batch file is quite simple. The first thing it should do is navigate to the folder where the job is. We will do this with a "cd" command like this :

cd C:\Users\ELHASSMU\Desktop\issue 1\Export\Issue_1_job_0.1\Issue_1_job
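Hard-coding the folder works, but the path has to be edited whenever the job is moved. As an alternative sketch, assuming the batch file is stored in the job folder itself (as run.bat is in section 3), the batch file can switch to its own location. Here %~dp0 expands to the drive and folder of the running script, and /d lets "cd" change the drive as well :

```batch
rem Move to the folder that contains this batch file (drive included)
cd /d "%~dp0"
```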

Next, we retrieve the path of the file on which our job is going to run (the first argument of the batch file, %1) and store it in a variable called "if" like this :

set if=%1

Next, we create another variable called "of" and store in it the path of the output file. This path is the path of the input file with "_1" inserted just before the ".csv" suffix. For example, if the input file is called "toto.csv" then the output file will be "toto_1.csv". This can be done with the following code :

rem Remove the ".csv" suffix from the input path
set of=%if:.csv=%
rem Append "_1.csv" to build the output path
set of=%of%_1.csv
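The code above works on the text of the path. A variant sketch that relies on cmd's parameter modifiers instead is shown here: %~1 is the first argument without surrounding quotes, and %~dpn1 is its drive, folder and file name without the extension. It assumes the output should always end in "_1.csv", whatever the extension of the input :

```batch
rem Input path with any surrounding quotes removed
set "if=%~1"
rem Same drive, folder and base name as the input, with "_1.csv" appended
set "of=%~dpn1_1.csv"
```

For an input of C:\data\toto.csv this sets "of" to C:\data\toto_1.csv.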

Then, to avoid problems with backslashes in paths on the Java side, we replace every backslash with a forward slash using this code :

set of=%of:\=/%
set if=%if:\=/%

For the final parameter "att" we just set a default value "TuTu" like this :

set att="TuTu"

Then we just display the input and output file paths with an "echo" command :

echo The  input file is %if%
echo The output file is %of%

Finally, we call our Talend job with the "java" command and pass our parameters to it :

java -Xms256M -Xmx1024M -cp classpath.jar; converg.issue_1_job_0_1.Issue_1_job --context=Default --context_param input_file=%if% --context_param output_file=%of% --context_param attribute=%att%

The whole code will be :
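Putting the snippets above together gives roughly the following batch file, saved as run.bat in the job folder (the name used in section 3). This is a sketch: the "@echo off" line and the "rem" comments are additions for readability, and the output name is built with a plain ".csv" substitution.

```batch
@echo off
rem Go to the folder that contains the exported job
cd C:\Users\ELHASSMU\Desktop\issue 1\Export\Issue_1_job_0.1\Issue_1_job

rem Input file: the path passed as first argument
set if=%1

rem Output file: same path with "_1" inserted before ".csv"
set of=%if:.csv=%
set of=%of%_1.csv

rem Use forward slashes so the paths are safe to hand to Java
set of=%of:\=/%
set if=%if:\=/%

rem Default value for the "attribute" parameter
set att="TuTu"

echo The  input file is %if%
echo The output file is %of%

rem Run the Talend job with the three context parameters
java -Xms256M -Xmx1024M -cp classpath.jar; converg.issue_1_job_0_1.Issue_1_job --context=Default --context_param input_file=%if% --context_param output_file=%of% --context_param attribute=%att%
```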

3. The context menu :

To add our job to the context menu, we have to edit the Windows registry with "regedit".
Open regedit.exe from the start menu search or run box, and then browse to this key:

HKEY_CLASSES_ROOT\*\shell

Right-click on “shell” and create a new key, calling it “Run Issue_2”. Then create a new key under it called “command”.

Double-click the (Default) value in the right-hand pane and enter the following:

C:\Users\ELHASSMU\Desktop\issue 1\Export\Issue_1_job_0.1\Issue_1_job\run.bat %1


As you can see, we entered the absolute path of the batch file created in section 2, C:\Users\ELHASSMU\Desktop\issue 1\Export\Issue_1_job_0.1\Issue_1_job\run.bat, followed by %1. The %1 is what passes the path of the selected file or files to our batch file.

Press "OK". The change should take effect immediately: right-click on any file and you’ll see a new item in the context menu called "Run Issue_2".
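Before relying on the new menu entry, it can help to confirm what Explorer actually passes as %1. A minimal sketch, assuming a throwaway file with the hypothetical name show_args.bat that is temporarily used in place of run.bat in the (Default) value :

```batch
@echo off
rem show_args.bat (hypothetical name): prints the argument received from Explorer
echo Raw argument      : %1
echo Without quotes    : %~1
echo Folder of the file: %~dp1
echo Name and extension: %~nx1
rem Keep the window open so the output can be read
pause
```

If the selected file has spaces in its path and only the first part shows up, wrapping %1 in double quotes in the registry value is the usual remedy.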

Note :
All the steps described above can be replaced by a script that adds the registry keys for us with a single double click. For our example, create a file called "Add_Registry" with the extension ".reg" and put this code in it :

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\Run Issue_2]

[HKEY_CLASSES_ROOT\*\shell\Run Issue_2\command]
@="C:\\Users\\ELHASSMU\\Desktop\\issue 1\\Export\\Issue_1_job_0.1\\Issue_1_job\\run.bat %1"

Note that the backslashes in the path of the batch file are doubled. A ".reg" file requires this escaping, and the import will not work without it.
Save your file, then double click on it to import it into the registry.

To remove the registry key, create another file called, for example, "Remove_Registry", again with the extension ".reg", and write this code in it :

Windows Registry Editor Version 5.00

[-HKEY_CLASSES_ROOT\*\shell\Run Issue_2]

Save your file, then double click on it to remove the key from the registry.
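The same two operations can also be scripted with the built-in "reg" command instead of ".reg" files. A sketch, assuming it is saved as a batch file (hypothetical name Add_Registry.bat) and run from a command prompt with enough rights to write under HKEY_CLASSES_ROOT, which may mean an elevated prompt :

```batch
@echo off
rem Create the "Run Issue_2" entry and its command for all file types.
rem %%1 is written to the registry as a literal %1 (a single %1 would be
rem replaced by this script's own first argument).
reg add "HKCR\*\shell\Run Issue_2\command" /ve /d "C:\Users\ELHASSMU\Desktop\issue 1\Export\Issue_1_job_0.1\Issue_1_job\run.bat %%1" /f

rem To undo it later, delete the whole "Run Issue_2" key:
rem reg delete "HKCR\*\shell\Run Issue_2" /f
```

Here /ve targets the (Default) value of the key, /d gives its data and /f skips the confirmation prompt. Unlike in a ".reg" file, the backslashes are not doubled.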

All you have to do now is select one or more "CSV" files, right click on them and choose "Run Issue_2". This will launch the job on them and generate the output files.

4. Conclusion

This tutorial is finished. I hope that the steps were clear and that you will be able to reproduce them on your machine. If not, do not hesitate to ask questions or leave comments or constructive criticism. Thank you for reading the entire tutorial.
