Source Qualifier Transformation.


Chapter 21. Source Qualifier Transformation


  • Active and Connected.

  • The Source Qualifier transformation represents the rows that the Integration Service reads when it runs a session.

  • We can use SQ to complete following tasks:
  1. Join data originating from the same source database.
  2. Filter rows when the Integration Service reads source data.(by making use of filter condition).
  3. Specify an outer join rather than the default inner join./*could not test*/
  4. Specify sorted ports.
  5. Select only distinct values from the source.
  6. Create a custom query to issue a special SELECT statement for the Integration Service to read source data.

  • Transformation data types:
    The Source Qualifier transformation displays the transformation datatype which determine how source database binds data when the Integration Service reads it. If the datatypes in the source definition and Source Qualifier transformation do not match, the Designer marks the mapping invalid when you save it.

  • Target Load order:
    We specify a target load order based on the Source Qualifier transformations in a mapping. If you have multiple Source Qualifier transformations connected to multiple targets, you can designate the order in which the Integration Service loads data into the targets.

  • Date Time values:
    When you use a datetime value or a datetime parameter or variable in the SQL query, change the date format to the format used in the source. Some databases require you to identify datetime values with additional punctuation, such as single quotation marks or database specific functions. For example, to convert the $$$SessStartTime value for an Oracle source, use the following Oracle function in the SQL override:
    to_date (‘$$$SessStartTime’, ‘mm/dd/yyyy hh24:mi:ss’)

  • Using Parameters and Variables in source qualifier:
 You can use parameters and variables in the SQL query, user-defined join, source filter, and pre- and post-session SQL commands of a Source Qualifier transformation.
You can enter a parameter or variable within the SQL statement, or you can use a parameter or variable as the SQL query. For example, you can use a session parameter, $ParamMyQuery, as the SQL query, and set $ParamMyQuery to the SQL statement in a parameter file.

  • Source Qualifier transformation properties:

Setting Description
SQL Query
Defines a custom query that replaces the default query the Integration Service uses to read data from sources represented in this Source Qualifier transformation
User-Defined Join
Specifies the condition used to join data from multiple sources represented in the same Source Qualifier transformation.
Source Filter
Specifies the filter condition the Integration Service applies when querying rows.
Number of Sorted Ports
Indicates the number of columns used when sorting rows queried from relational sources. If you select this option, the Integration Service adds an ORDER BY to the default query when it reads source rows.
Tracing Level
Sets the amount of detail included in the session log when you run a session containing this transformation.
Select Distinct
Specifies if you want to select only unique rows.
Pre-SQL
Pre-session SQL commands to run against the source database before the Integration Service reads the source.
Post-SQL
Post-session SQL commands to run against the source database after the Integration Service writes to the target.
Output is Deterministic
Relational source or transformation output that does not change between session runs when the input data is consistent between runs. When you configure this property, the
Integration Service does not stage source data for recovery if transformations in the pipeline always produce repeatable data.
Output is Repeatable
Relational source or transformation output that is in the same order between session runs when the order of the input data is consistent. When output is deterministic and output is
repeatable, the Integration Service does not stage source data for recovery.


    What is a Default Query?

    For relational sources, the Integration Service generates a query for each Source Qualifier transformation when it runs a session.
    The default query is a SELECT statement with only the columns that are connected to another transformation.






Although there are many columns in the source definition, only three columns are connected to another transformation. In this case, the Integration Service generates a default query that selects only those three columns:

SELECT CUSTOMERS.CUSTOMER_ID, CUSTOMERS.COMPANY, CUSTOMERS.FIRST_NAME FROM CUSTOMERS;

You can view the default query in the Source Qualifier transformation. From the Properties tab, select SQL Query and then click Generate SQL.

*******************************************

Can we override the default query generated by Source Qualifier?


You can alter or override the default query in the Source Qualifier transformation by changing the default settings of the transformation properties.
Do not change the list of selected ports or the order in which they appear in the query. This list must match the connected transformation output ports.

The SQL Query overrides the User-Defined Join, Source Filter, Number of Sorted Ports, and Select Distinct settings in the Source Qualifier transformation.

************************************


How do we join data in Source Qualifier?

    We can use one Source Qualifier transformation to join data from multiple relational tables. These tables must be accessible from the same instance or database server. Thus tables must be in the same database instance to be able to join them in SQ. If not then we have to use Joiner Transformation. We call these as heterogeneous joins.

To perform a heterogeneous join, use the Joiner transformation.
Join data from different source databases
Join data from different flat file systems
Join relational sources and flat files

What is Default Join?

When you join related tables in one Source Qualifier transformation, the Integration Service joins the tables based on the related keys(PK-FK) in each table.

This default join is an inner equijoin, using the following syntax in the WHERE clause:
Source1.column_name = Source2.column_name

The columns in the default join must have:

♦ A primary key-foreign key relationship
♦ Matching datatypes.



CUSTOMERS and ORDERS tables have PK-FK relation.

We join query can also be viewed by doing “GENERATE SQL”.
In above case the query would look like:

SELECT CUSTOMERS.CUSTOMER_ID, CUSTOMERS.COMPANY, CUSTOMERS.FIRST_NAME, CUSTOMERS.LAST_NAME, CUSTOMERS.ADDRESS1, CUSTOMERS.ADDRESS2, CUSTOMERS.CITY, CUSTOMERS.STATE, CUSTOMERS.POSTAL_CODE, CUSTOMERS.PHONE, CUSTOMERS.EMAIL,
ORDERS.ORDER_ID, ORDERS.DATE_ENTERED, ORDERS.DATE_PROMISED, ORDERS.DATE_SHIPPED, ORDERS.EMPLOYEE_ID, ORDERS.CUSTOMER_ID, ORDERS.SALES_TAX_RATE, ORDERS.STORE_ID FROM CUSTOMERS, ORDERS
WHERE CUSTOMERS.CUSTOMER_ID=ORDERS.CUSTOMER_ID

The WHERE clause is an equijoin that includes the CUSTOMER_ID from the ORDERS and CUSTOMER tables.

You might need to override the default join under the following circumstances:
Columns do not have a primary key-foreign key relationship.
The datatypes of columns used for the join do not match.
You want to specify a different type of join, such as an outer join.


    We can create primary key-foreign key relationships as shown in the above screen shot in the Source Analyzer by linking matching columns in different tables. These columns do not have to be keys, but they should be included in the index for each table.
      For ex: The ORDERS and PAYMENTS tables do not share primary and foreign keys. Both tables, however, include a DATE_SHIPPED column. You can create a primary key-foreign key relationship in the metadata in the Source Analyzer.

You create a relationship between the ORDERS and PAYMENTS tables by linking the DATE_SHIPPED columns. The Designer adds primary and foreign keys to the DATE_SHIPPED columns in the ORDERS and PAYMENTS table definitions.




 The primary key-foreign key relationships exist in the metadata only. You do not need to generate SQL or alter the source tables.

Once the key relationships exist, use a Source Qualifier transformation to join the two tables. The default join is based on DATE_SHIPPED.


How to override the default Query?

    1. Open the Source Qualifier transformation, and click the Properties tab.
    2. Click the Open button in the SQL Query field. The SQL Editor dialog box appears.
    3. Click Generate SQL.
    The Designer displays the default query it generates when querying rows from all sources included in the Source Qualifier transformation. When the Designer generates the default query, it incorporates all other configured options, such as a filter or number of sorted ports.
    4. Enter a query in the space where the default query appears.
    Every column name must be qualified by the name of the table, view, or synonym in which it appears. For example, if you want to include the ORDER_ID column from the ORDERS table, enter ORDERS.ORDER_ID.
When creating a custom SQL query, the SELECT statement must list the port names in the order in which they appear in the transformation.

        5. Select the ODBC data source containing the sources included in the    query.
        6. Enter the user name and password to connect to this database.
        7. Click Validate.
        The Designer runs the query and reports whether its syntax was correct.
       8.Click OK to return to the Edit Transformations dialog box. Click OK  again to return to the Designer.


How to we enter User-Defined Joins?


  • Entering a user-defined join is similar to entering a custom SQL query, but we only enter the contents of the where clause.

  • When you add a user-defined join, the Source Qualifier transformation automatically includes the setting in the default SQL query. However, if you modify the default query after adding a user-defined join, the Integration Service uses only the query defined in the SQL Query property of the Source Qualifier transformation.

  • We can even use parameters or variables for user defined join or include parameters and variables within the join.

Important note: Do not enter the keyword WHERE at the beginning of the join. The Integration Service adds this keyword w hen it queries rows.

Example: In the below example, we have two tables TCCC800_ADDRESS_HISTORY and TCCC806_ADDRESS being joined using user defined join.
In the user defined join value we can use the point-and-click method to create the join condition.






When you generate the default query we can see that the join condition is automatically added to the query. Also note that to generate the default query the source qualifier must be connected to next transformation.

Informatica join syntax:
When you enter join syntax, use the Informatica or database-specific join syntax. When you use the Informatica join syntax, the Integration Service translates the syntax and passes it to the source database during the session. Always advised to use database specific Syntax.

When you use Informatica join syntax, enclose the entire join statement in braces ({Informatica syntax}). When you use database syntax, enter syntax supported by the source database without braces.

When using Informatica join syntax, use table names to prefix column names. For example, if you have a column named FIRST_NAME in the REG_CUSTOMER table, enter “REG_CUSTOMER.FIRST_NAME” in the join syntax.

Following is is the list of places where we can add the join syntax in source Qualifier transformation.
  1. User Defined join: The Integration Service append s the join override to the WHERE or FROM clause of the default query.
  2. SQL query: Enter join syntax immediately after the WHERE in the
    default query.

When you combine joins, enter them in the following order:
1. Normal
2. Left outer
  1. Right outer

You can create a normal join using the join condition in a source qualifier. However, if you are creating an outer join, you need to override the default join to perform an outer join.
I WAS NOT ABLE TO TEST OUTER JOIN OVERRIDE IN THE USER DEFINED JOIN.


How do we enter the Source filter?

  • We can use Source filter on properties tab to filter rows.

  • We should not include “WHERE” in the condition.

  • The Source Qualifier transformation includes source filters in the default SQL query. If, however, you modify the default query after adding a source filter, the Integration Service uses only the query defined in the SQL query portion of the Source Qualifier transformation.

  • You can use a parameter or variable as the source filter or include parameters and variables within the source filter.

In below example we we have added the Source Filter.






How to use Sorted Ports:

  • When you use sorted ports, the Integration Service adds the ports to the ORDER BY clause in the default query.

  • We might use sorted ports to improve performance when you include aggregator or joiner in our mapping.

  • Use Sorted ports only for relational sources.
The Source Qualifier transformation includes the number of sorted ports in the default SQL query. However, if you modify the default query after choosing the Number of Sorted Ports, the Integration Service uses only the query defined in the SQL Query property.




As we can see When we add number of Sorted ports as 5, the default query automatically incorporates the ORDER BY clause on the first 5 ports.


How do we use the Distinct option?

  • If you want the Integration Service to select unique values from a source, use the Select Distinct option.

  • If you choose Select Distinct, the Source Qualifier transformation includes the setting in the default SQL query.

  • However, if you modify the default query after choosing Select Distinct, the Integration Service uses only the query defined in the SQL Query property.



Explain Pre Session and Post Session Command Task:


We can add pre- and post-session SQL commands on the Properties tab in the Source Qualifier transformation.

The Integration Service runs pre-session SQL commands against the source database before it reads the source. It runs post-session SQL commands
against the source database after it writes to the target.

You can override the SQL commands in the Transformations view on the Mapping tab in the session properties.
You can also configure the Integration Service to stop or continue when it encounters errors running pre- or post-session SQL commands.
This Can be done on the config tab of session.

You can use parameters and variables in source pre- and post-session SQL commands, or you can use a parameter or variable as the command.

Use a semicolon (;) to separate multiple statements. The Integration Service issues a commit after each statement.

The Integration Service ignores semicolons within /*...*/.

The Designer does not validate the SQL.

Pre Session example:

Pre-Post session Sql error handling in session:

Notes:
The Designer does not allow you to connect multiple Source Qualifier transformations to a single target. There are two workarounds:

Reuse targets. Since target definitions are reusable, you can add the same target to the mapping multiple times. Then connect each Source Qualifier transformation to each target.

Join the sources in a Source Qualifier transformation. Then remove the WHERE clause from the SQL query.





4 comments:

  1. Hi I have a question, Say in source definition for a column the data type is char and in source qualifier the data type is small integer.and from database where i am reading, column is char, Then while running workflow will it fail?

    ReplyDelete
  2. Thanks for explaining the source qualifier in detail.

    ReplyDelete
  3. Thank you.Well it was nice post and very helpful information on Informatica Online Training Hyderabad

    ReplyDelete
  4. It is a relevant website for me and it made my day . do Americans need visa for Turkey. yes of course , why not. Turkish visa is compulsory for American citizens to visit turkey.Americans play a very important role in Turkey Tourism .

    ReplyDelete