The list of available Custom Steps further down on this page uses the following abbreviations to group related steps.

Abbreviation | Explanation |
---|---|
CAS | Steps in this category provide utilities for working with data in CAS |
CV | Computer Vision |
DQ | Data Quality |
LLM | Large Language Model |
NLP | Natural Language Processing |
OCR | Optical Character Recognition |
SDG | Synthetic Data Generation |

Name | Brief Description | Owner/Contact | Viya Version Supported | Last Update |
---|---|---|---|---|
_template | Template to use for contributions | SAS | 2020.1.5 or later | 04OCT2024 |
Airflow - Generate DAG | Generates an Apache Airflow DAG using SAS Studio Flow where flow steps represent Airflow tasks using the SAS Airflow Provider | Nicolas Robert | 2023.12 or later | 21FEB2024 |
Anonymize and Mask Data | Anonymize and Mask Data using QKB definitions | Mary Kathryn Queen | 2023.06 or later | 19FEB2024 |
Append Table | Appends data to a target table with support for maintaining unique incremental id | Torben Juul Johansson | 2020.1.5 or later | 26OCT2022 |
CAS - Convert Char to Varchar | Create a copy of a table and convert chars to varchars | Carlo Petti | 2023.01 or later | 05SEP2023 |
CAS - Generate unique ID | Generates a new column containing a unique identifier (ID) per observation for a given input CAS table | Sundaresh Sankaran | 2022.11 or later | 07FEB2023 |
CAS - Load Tables from Folders in Filesystem | Load all files in a directory to CAS tables | Sundaresh Sankaran / Wilbram Hazejager | 2022.11 or later | 21DEC2022 |
CAS - Submit Python and R code | Submit Python/R code to CAS server using CAS Gateway action | David Weik | 2023.11 or later | 11JAN2024 |
CAS - Validate unique ID | Validates if column contains unique values for a given input CAS table | Sundaresh Sankaran | 2023.06 or later | 11JUL2023 |
Catalog - List Agents | Extract list of configured SAS Information Catalog agents into a table | David Weik | 2023.12 or later | 15JAN2024 |
Catalog - Run Agent | Triggers the run of a SAS Information Catalog agent | David Weik | 2023.12 or later | 15JAN2024 |
Create Listing Of Directory CLOD | Create table containing names of all files in directory | Stephan Weigandt | 2020.1.5 or later | 25SEP2024 |
CV - Create Object Detection Table | Create CAS table (training dataset) with images and labels for use with CAS actions for Computer Vision (CV) and for use with SAS DLPy Python library | Neela Vengateshwaran | 2024.01 or later | 21JUL2024 |
CV - Load Images | Load image files into a CAS table for use with CAS image analytics actions | Robert W Blanchard | 2024.05 or later | 04OCT2024 |
CV - Merge Data with Images | Merge or append data that contains images | Robert W Blanchard | 2024.05 or later | 19SEP2024 |
CV - Train Models | Develop Computer Vision (CV) models to accomplish one of four prediction tasks: 1. Image classification, 2. Image regression, 3. Object detection, 4. Multi-task | Robert W Blanchard | 2024.05 or later | 19SEP2024 |
Data Synthesis with Python faker | Generate synthetic data using Python faker module (also includes custom steps to install/load Python modules) | Angus Looney / Duncan Bain | 2021.1.1 or later | 22DEC2022 |
Detect Data Drift | Calculate metrics for tracking changes between two groups of records (representing two time intervals) inside a table | David Weik | 2024.07 or later | 15AUG2024 |
Download Job Execution Log | Store Job Execution Log in a user-specified location on SAS Compute Server file system | Remco Gooijer | 2024.01 or later | 03NOV2024 |
DQ - Cluster Analysis | Compare pairs of rows in a cluster and identify potential false positives | Clemens Knobloch | 2023.10 or later | 23JAN2024 |
DQ - Change Case | Upper-, Lower-, or Propercase data values using QKB locale specific rules | Clemens Knobloch | 2023.01 or later | 15MAR2023 |
DQ - Clustering | Cluster records based on column values, e.g. match codes | Lorenzo Toja / Arnold Toporowski / Nikolaus Hartung | 2021.1.1 or later | 05AUG2024 |
DQ - Create QKB Reference Tables | Create QKB Reference Tables | Mary Kathryn Queen | 2023.06 or later | 10OCT2023 |
DQ - Identify | Obtain the Identity Type of data values using the dqIdentify function | Arnold Toporowski | 2021.1.1 or later | 29NOV2022 |
DQ - Match Code | Create Match Codes based on locale, using SAS QKB and dqMatch function | Lorenzo Toja / Arnold Toporowski | 2021.1.1 or later | 19MAR2024 |
DQ - Parsing | Parse a string into a set of tokens using QKB locale specific rules | Clemens Knobloch | 2023.01 or later | 15MAR2023 |
DQ - Standardize | Create standardized values based on locale, using SAS QKB and dqStandardize function (includes support for generating masked values) | Lorenzo Toja / Arnold Toporowski | 2024.01 or later | 21MAR2024 |
DQ - Surviving Record | Extract the best record (a.k.a. Golden Record) from clusters of records, with support for standard deduplication routines and user-defined rules | Lorenzo Toja | 2023.06 or later | 24AUG2023 |
DuckDB | Uses DuckDB to read and write data in various DBMS and file types | Clemens Knobloch | 2023.01 or later | 12JUL2023 |
Dynamic Aggregations From Timeseries DAFT | Perform dynamic aggregations on timeseries data | Stephan Weigandt | 2022.1.2 or later | 08MAY2023 |
Export - ADLS File Writer | Write SAS tables to Parquet files on Azure Data Lake Storage (ADLS) | Alfredo Lorie | 2023.02 or later | 20APR2023 |
Export - GCSFS File Writer | Write SAS and CAS datasets to Google Cloud Storage (GCS) in Parquet and Delta Lake format using Python | Ignacio Rodríguez | 2023.11 or later | 21DEC2023 |
Export - Parquet | Export SAS tables to Parquet files using SAS Libname engine | Neil Griffin | 2023.10 or later | 15DEC2023 |
Extract Job Definition Metadata | Extract all content (SAS Code, HTML forms, XML Prompts and Parameters) from a SAS Job Definition and store it in files | David Weik | 2023.11 or later | 23NOV2023 |
Extract Text Features | Supports extracting many different features from text fields | David Weik / Ulrich Reincke / Rens Feenstra | 2022.10 or later | 27NOV2022 |
Extract VA Report Metadata | Extract info about CAS tables and columns used, calculated column definitions and their usage, across objects in VA report | Remco Gooijer | 2023.05 or later | 16APR2024 |
FTP Directory Listing | Create a table containing names of files in an FTP directory | Remco Gooijer | 2023.01 or later | 14AUG2024 |
FTP Download Files | Download file from an FTP Server, where list of files to download is provided in an input table | Remco Gooijer | 2023.01 or later | 14AUG2024 |
GEO - Shape Files | Manage Shape Files used in GIS systems for use in SAS Visual Analytics | Stefano Tucciarone | 2023.10 or later | 18APR2024 |
GeoDistance with Rounding | Calculate the distance between 2 supplied lat/long locations in either kilometers or miles | Mary Kathryn Queen | 2020.1.5 or later | 28SEP2022 |
Get Exchange Rates | Get Exchange Rates from Service Provider | David Weik | 2023.08 or later | 04SEP2023 |
Git - Clone Git Repo | Clone Git Repo as part of running a flow | Sundaresh Sankaran | 2022.11 or later | 25JAN2023 |
Git - Delete Local Repo | Delete LOCAL Repo as part of running a flow | Sundaresh Sankaran | 2022.11 or later | 25JAN2023 |
Git - List Local Repo Changes | List changed files inside local Git repository folder into a dataset as part of running a flow for easy reporting | Sundaresh Sankaran | 2022.11 or later | 07FEB2023 |
Git - Stage, Commit, Pull and Push Changes | Perform stage, commit, pull and push changes as part of running a flow | David Weik | 2023.01 or later | 19FEB2023 |
Great Expectations - Execute Rule | Run business rules based on Great Expectations Python package | Stephen Kotiang | 2023.03 or later | 11OCT2023 |
Great Expectations - Generate Expectation Suite | Generate rules on input data using Great Expectations | Mackenzie Looney | 2023.04 or later | 19OCT2023 |
Great Expectations - Run Expectations Suite | Compare data against an Expectation Suite | Mackenzie Looney | 2023.04 or later | 19OCT2023 |
HTTP Request | Send HTTP requests using proc HTTP and use URL parameter values from input table | Clemens Knobloch | 2024.06 or later | 17OCT2024 |
Import - ADLS File Reader | Read Parquet files from Azure Data Lake Storage (ADLS) with support for Delta Lake file format | Alfredo Lorie | 2023.02 or later | 25APR2023 |
Import - CSV with long column names | Import CSV file with long column names (>32 chars) in header row | Ignacio Rodríguez | 2023.11 or later | 15JAN2024 |
Import - Data Ingestion Auto Pilot DIAP (Light) for External Files | Ingest external file(s) from directory with push of a button | Stephan Weigandt | 2020.1.5 or later | 04OCT2024 |
Import - Extract Table from PDF | Extract tabular data from within a PDF document and load the same to a SAS dataset | Sundaresh Sankaran / Dragos Coles | 2023.03 or later | 01MAY2023 |
Import - GCSFS File Reader | Read Parquet and Delta Lake files from Google Cloud Storage (GCS) and write to SAS and CAS datasets using Python | Ignacio Rodríguez | 2023.11 or later | 21DEC2023 |
Import - Google Sheets | Import public Google Sheets as a SAS data set | David Weik | 2022.12 or later | 12JAN2023 |
Import - HTML Table | Import HTML table(s) from web page as SAS data set(s) using Python Pandas | David Weik | 2023.07 or later | 28JUL2023 |
LLM - Azure OpenAI RAG | Uses a Retrieval Augmented Generation (RAG) approach to provide right context to an Azure OpenAI Large Language Model (LLM) for purposes of answering a question | Samiul Haque / Sundaresh Sankaran | 2024.01 or later | 10JUL2024 |
LLM - Prompt Catalog | Submit queries to a Large Language Model, test prompts (prompt engineering) and save prompt history | Xin Ru Lee | 2023.12 or later | 24JUL2024 |
Log File Scraper | Extract ERRORS and WARNINGS from one or more SAS log files and make them available in a table | Remco Gooijer | 2021.1.5 or later | 08NOV2024 |
Lookup | Add column by performing lookup on other table (using data step hash object) | Torben Juul Johansson | 2021.2.1 or later | 21SEP2022 |
Loop Deployed Object | Parallel execution of a deployed flow or a SAS program for a given set of parameter values | Remco Gooijer | 2023.01 or later | 03OCT2024 |
Loop Flow | Iteratively execute another Studio Flow, much like Loop/Loop End in SAS Data Integration Studio | Torben Juul Johansson | 2023.09 or later | 08MAR2024 |
NLP - Categories Testing Framework | Test and assess SAS Visual Text Analytics categorization model | Sundaresh Sankaran | 2023.12 or later | 10JAN2024 |
NLP - Extract Identities | Pull entities out of documents or freeform text | Arnold Toporowski | 2022.12 or later | 13DEC2023 |
NLP - Extract Rule Configuration | Extracts the rule configuration within rules-based Visual Text Analytics Concepts or Categories model definitions for use in downstream applications. | Sundaresh Sankaran | 2023.04 or later | 01AUG2023 |
NLP - Identify Language | Identifies the language used for text data in an input table and creates a column containing the ISO 639-1 language code. | Sundaresh Sankaran | 2022.12 or later | 15FEB2023 |
NLP - Predefined Sentiment Analysis | Analyze a text corpus for the sentiment expressed in it | Sundaresh Sankaran | 2023.03 or later | 04APR2023 |
NLP - Profile Text | Profile text within a document corpus and understand its linguistic structure | Sundaresh Sankaran | 2022.07 or later | 19OCT2022 |
NLP - Score Text Classifier | Score a text corpus with a text classifier model trained using the deep learning (BERT-based) textClassifier.trainTextClassifier CAS action | Sundaresh Sankaran | 2023.02 or later | 19MAR2023 |
NLP - Sentence Splitter | Splits a text column into multiple observations with constituent sentences using CAS actions | Sundaresh Sankaran | 2023.08 or later | 03NOV2023 |
NLP - Train Text Classifier | Train a text classifier model based on deep learning (BERT-based transformer) architecture using textClassifier.trainTextClassifier CAS action (supports GPUs) | Sundaresh Sankaran | 2023.02 or later | 18MAR2023 |
OCR - AWS Textract | Use the AWS Textract service to perform different types of OCR on files that can be stored in S3 buckets or on the SAS Compute file system | Jannic Horst | 2022.09 or later | 08JAN2024 |
OCR - Azure AI Document Intelligence | Uses Microsoft Azure's Document Intelligence service to perform different types of OCR for files stored in the SAS Server file system or stored on a URL. | Sundaresh Sankaran / Jannic Horst | 2024.02 or later | 22MAR2024 |
OCR - Document Analysis - Execute Batch OCR Process | Uses SAS Document Analysis to perform a batch run on files stored in the SAS Server file system. | William Nadolski | 2024.08 or later | 09OCT2024 |
OCR - Document Analysis - Produce Usage Report Output | Uses SAS Document Analysis to generate a usage report on previous batch processes. | William Nadolski | 2024.08 or later | 29OCT2024 |
Python - Load Objects to SAS | Load Python objects to SAS Compute or CAS tables | Sundaresh Sankaran | 2023.08 or later | 01SEP2022 |
Python - Virtual Environments | A collection of 5 SAS Studio custom steps which help you create, activate, and switch between virtual Python environments for use within SAS Viya | Sundaresh Sankaran | 2020.1.5 or later | 12JUL2022 |
R Runner | Submit R scripts with support for input and output table | Samiul Haque / Sundaresh Sankaran | 2023.08 or later | 18AUG2023 |
Rank Columns - Starter template | Simple Example (based on template) | SAS | 2020.1.5 or later | 26AUG2022 |
SAS Content - Copy File from File System | Copy file from Compute file system into SAS Content folder programmatically | Sundaresh Sankaran | 2022.11 or later | 09JAN2024 |
SAS Content - Create Folder | Creates a new folder in SAS Content programmatically | Sundaresh Sankaran | 2022.11 or later | 18DEC2023 |
SAS Content - Obtain Folder URI | Obtain URI of selected SAS Content folder and save it in a global macro variable | Sundaresh Sankaran | 2022.11 or later | 18DEC2023 |
SCD Loader | Slowly Changing Dimensions loader with support for type 1 and type 2 changes | Torben Juul Johansson | 2020.1.5 or later | 28SEP2022 |
SDG - Generate Synthetic Data through GANs | Generate synthetic data using a trained GAN model | Sundaresh Sankaran | 2022.09 or later | 04NOV2024 |
SDG - Generate Synthetic Data through SMOTE | Generate synthetic data based on an input table, using the Synthetic Minority Oversampling TEchnique (SMOTE). | Sundaresh Sankaran | 2024.10 or later | 02NOV2024 |
SDG - Train a Synthetic Data Generator through GANs | A collection of 3 steps which help you train, score and assess Synthetic Data models | Sundaresh Sankaran | 2024.10 or later | 04NOV2024 |
Send SMTP Email | Send Email message | Mary Kathryn Queen | 2022.1.4 or later | 03APR2023 |
Send Teams Message | Send Microsoft Teams Messages to a Teams channel | David Weik / Tamara Fischer | 2022.10 or later | 14JUN2023 |
Surrogate Key Generator | Generates a surrogate key based on a business key | Torben Juul Johansson | 2020.1.5 or later | 29SEP2022 |
Translate Text | Translates text stored in a column using DeepL API | David Weik | 2023.04 or later | 10MAY2023 |
Update column labels | Update column labels from a (metadata) table, delimited file, or interactively | Ignacio Rodríguez | 2023.11 or later | 21DEC2023 |
Vector Databases - Hydrate Chroma DB Collection | Populate a Chroma vector database collection with documents and embeddings contained in a CAS table | Sundaresh Sankaran | 2023.12 or later | 24JAN2024 |
Vector Databases - Query Chroma DB Collection | Query a Chroma vector database collection with documents and store results in CAS table | Sundaresh Sankaran | 2023.12 or later | 30JAN2024 |
Vector Search - Fast KNN | Identify nearest neighbors to observations in an input query table | Sundaresh Sankaran | 2023.11 or later | 09FEB2024 |
Zip data | Move or copy files residing on the Compute file system to zip files using Python | Neil Griffin | 2024.05 or later | 25JUN2024 |