Talend Certification Tips - Tip #001 : Group of Contexts, Contexts & Context Variables

Hi Talend Addicts,

It has been a while since my last post about Talend :) But I'm back to share more knowledge about this great ETL tool!

In fact, I am starting a new series on the most important things (tips) to know for anybody who wants to tackle the Talend Open Studio for Data Integration Certification Exam, using concrete examples with screenshots and step-by-step guidelines.

Let's kick this off with the first post of what should become a long series.

Tip #001 is about the difference between a Group of Contexts, Contexts and Context Variables. In my experience, many people who are new to Talend confuse these three elements.

First, you need to know that a Group of Contexts is composed of many contexts. In other words, think of the Group of Contexts as the parent and the contexts as its children.
Second, a Group of Contexts contains many context variables, and each variable can hold a different value in each context.

A picture is worth a thousand words, right?


So as you can see above, if you create a Group of Contexts (in red) you can add many Contexts to it (in green: myContext1, myContext2, ...), and inside each context you can assign different values to the context variables (in blue: myContextVariable1, myContextVariable2, ...).
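
To make the parent/child relationship concrete, here is a minimal Java sketch of the same idea. It is only a mental model, not how Talend stores contexts internally: the class `ContextGroupSketch`, its methods and the values "value A" / "value B" are made up for illustration, and only the context and variable names come from the picture.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of a Group of Contexts (NOT Talend's internal representation).
class ContextGroupSketch {
    // context name -> (context variable name -> value of that variable in that context)
    private final Map<String, Map<String, String>> contexts = new LinkedHashMap<>();

    // Assigns a value to a variable inside one context, creating the context if needed.
    void set(String contextName, String variableName, String value) {
        contexts.computeIfAbsent(contextName, k -> new LinkedHashMap<>())
                .put(variableName, value);
    }

    // Returns the variable's value in the given context, or null if either is unknown.
    String get(String contextName, String variableName) {
        Map<String, String> values = contexts.get(contextName);
        return values == null ? null : values.get(variableName);
    }

    // Builds a group with two contexts that give the same variable different values.
    static ContextGroupSketch example() {
        ContextGroupSketch group = new ContextGroupSketch();
        group.set("myContext1", "myContextVariable1", "value A"); // illustrative value
        group.set("myContext2", "myContextVariable1", "value B"); // illustrative value
        return group;
    }

    public static void main(String[] args) {
        ContextGroupSketch group = example();
        System.out.println(group.get("myContext1", "myContextVariable1")); // prints "value A"
        System.out.println(group.get("myContext2", "myContextVariable1")); // prints "value B"
    }
}
```

The point to take away is that the variable name stays the same everywhere; only the context you pick decides which value you get back.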

Ok, you might ask: "why do I care?" 
Let's take a concrete example in a real project.

You are asked by your client to build a Talend job that will migrate some data from point A to point B. Let's say for example from an On-Premise database to a Salesforce Instance. Until this point everything should be okay!

You create your Job and it works perfectly in your DEV environment:
  • Your On-Premise database is the Production database
  • Your Salesforce Instance is the Dev database (a Salesforce Dev Sandbox to be more accurate)
So once the job is built and tested, you want to move it to a higher environment, for example UAT, so users can run their tests. But wait a minute: how can you achieve that without changing the Talend job?
  • This is where Groups of Contexts, Contexts & Context Variables come into play.
Not clear yet? Let's get our hands dirty and build this.

I've created a very simple job that is just displaying a String (but you'll get the idea). I want to be able to change the displayed message depending on the context without changing anything in my job. This is how I would do it:
  1. Create a new Group of Contexts from the Repository (left pane)
  2. Create two contexts: DEV & PROD
  3. Create a new variable: salesforceURL of type String
    1. for the DEV context: give it the value test.salesforce.com (Salesforce testing sandbox URL)
    2. for the PROD context: give it the value login.salesforce.com (Salesforce production org URL)
Your Group of Contexts should look like this:


  1. Now let's create a new Job
  2. From the Contexts tab add the previously created Group of Contexts
  3. Drag a tJava component into the Job
  4. Paste the code below inside the tJava component

// Print a short banner; the target instance comes from the active context.
System.out.println("\n---------------------------------------------------------------\n");
System.out.println("This Job migrates data.");
System.out.println("From : Production Database");
System.out.println("To : " + context.salesforceURL + " Salesforce Instance");
System.out.println("\n---------------------------------------------------------------\n");


Now let's run our job with the two different contexts. To run the same job with a different context, all we have to do is select the context from the drop-down menu in the Run tab:
  • First we run it with the DEV context
  • Second we run it with the PROD context
As you will see, the exact same Talend Job now behaves differently depending on the context it is run with (cf. image below and try it yourself).
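
If you do not have Talend Studio at hand, the two runs can be approximated with the plain Java sketch below. It is only a stand-in under stated assumptions: the class `ContextRunSketch`, the map `SALESFORCE_URL` and the method `describe` are hypothetical helpers that play the role of the context selection, while the messages and the two URLs are the ones used in this post.

```java
import java.util.Map;

// Hypothetical stand-in for running the Job with the DEV or PROD context.
class ContextRunSketch {
    // context name -> value of salesforceURL in that context (values from the steps above)
    static final Map<String, String> SALESFORCE_URL = Map.of(
            "DEV", "test.salesforce.com",
            "PROD", "login.salesforce.com");

    // Builds the message the tJava code prints (without the dashed separators).
    static String describe(String contextName) {
        String salesforceURL = SALESFORCE_URL.get(contextName);
        if (salesforceURL == null) {
            throw new IllegalArgumentException("Unknown context: " + contextName);
        }
        return "This Job migrates data.\n"
                + "From : Production Database\n"
                + "To : " + salesforceURL + " Salesforce Instance";
    }

    public static void main(String[] args) {
        System.out.println(describe("DEV"));  // last line: To : test.salesforce.com Salesforce Instance
        System.out.println();
        System.out.println(describe("PROD")); // last line: To : login.salesforce.com Salesforce Instance
    }
}
```

Only the "To :" line differs between the two runs, which is the behavior you should see in the Job's console output as well.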



Conclusion
Obviously our job is not doing much, but you can imagine how powerful this concept is. With contexts you can create one single Talend Job and run it with many different configurations, in many different environments, producing many different results without changing anything in the Job itself.
