Chapter
17 : Rank Transformation
- Active and connected.
- We can select top 10 or bottom 10 rows .. thus the number of rows decreases and thus it is an active transformation.
- During a session, the Integration Service compares an input row with rows in the data cache. If the input row out-ranks a cached row, the Integration Service replaces the cached row with the input row.
- If you configure the Rank transformation to rank across multiple groups (Using Group by in SQL), the Integration Service ranks incrementally for each group it finds.
- The Integration Service stores group information in an index cache and row data in a data cache. If you create multiple partitions in a pipeline, the Integration Service creates separate caches for each partition.
- Following is what we do when we create a Rank transformation:
♦
Enter
a cache directory.
♦
Select
the top or bottom rank.(TYpe
of Rank)
♦
Select
the input/output port that contains values used to determine the
rank. You
can select only one port to define a rank.
♦
Select
the number of rows falling within a rank.
♦
Define
groups for ranks (define
the columns to be used for GROUP BY)
- Following are the types of ports on the Rank Transformation:
- Input: Minimum one: Create an input port to receive data from anothertransformation.
- Output: Minimum one: Create an output port for each port you want to link to another transformation. You can designate input ports as output ports.
- Variable Ports: Not Required: Can use to store values or calculations to use in an expression. Variable ports cannot be input or output ports. They pass data within the transformation only.
- Rank Port: Only one: Use to designate the column for which you want to rank values. You can designate only one Rank port in a Rank transformation. The Rank port is an input/output port.
You
must link the Rank port to another transformation.
- Rank index:The Designer creates a RANKINDEX port for each Rank transformation. The Integration Service uses the Rank Index port to store the ranking position for each row in a group.The RANKINDEX is an output port only. You can pass the rank index to another transformation in the mapping or directly to a target.
If
two rank values match, they receive the same value in the rank index
and the transformation skips the next value. Just like Rank SQL
function.
- Defining Groups:Like the Aggregator transformation, the Rank transformation lets you group information.Example: If you want to select the 10 most expensive items by manufacturer, you would first define a group for each manufacturer.
- Steps to create a Rank Transformation:
- In the Mapping Designer, click Transformation > Create. Select the Rank transformation. Enter a name for the Rank. The naming convention for Rank transformations is RNK_ TransformationName.
- Link columns from an input transformation to the Rank transformation.
- Click the Ports tab and select the Rank (R) option for the rank port.
- If you want to create groups for ranked rows, select Group By for the port that defines the group.
- Click the Properties tab and select whether you want the top or bottom rank.
- For the Number of Ranks option, enter the number of rows you want to select for the rank
- Following describes the properties tab in more detail:SettingMeaningCache DirectoryLocal directory where the Integration Service creates the index and data cache files. By default, the Integration Service uses the directory entered in the Workflow Manager for the process variable $PMCacheDir.Top/BottomSpecifies whether you want the top or bottom ranking for a column.Number of RanksNumber of rows you want to rank.Tracing LevelDetermines the amount of information the Integration Service writes to the session log about data passing through this transformation in a session.Rank Data Cache SizeData cache size for the transformation. Default is 2,000,000 bytes.You can configure a numeric value, or you can configure the Integration Service to determine the cache size at runtime. If you configure the Integration Service to determine the cache size, you can also configure a maximum amount of memory for the Integration Service to allocate to the cache.Rank Index Cache SizeIndex cache size for the transformation. Default is 1,000,000 bytes. You can configure a numeric value, or you can configure the Integration Service to determine the cache size at runtime. If you configure the Integration Service to determine the cache size, you can also configure a maximum amount of memory for the Integration Service to allocate to the cache.Transformation ScopeSpecifies how the Integration Service applies the transformation logic to incoming data:
- Transaction. Applies the transformation logic to all rows in a transaction. Choose Transaction when a row of data depends on all rows in the same transaction, but does not depend on rows in other transactions.
- All Input. Applies the transformation logic on all incoming data. When you choose All Input, the PowerCenter drops incoming transaction boundaries.
It was really a nice article and i was really impressed by reading this Informatica Online Course
ReplyDelete