Changelog

1.2.9

update: GITLAB token

1.2.8

update: AWS credentials and GITHUB token

1.2.7

update: DB user

1.2.6

replace: Console links with Studio links in pipelines

1.2.5

fix: Missing dependency in chrome installation

1.2.4

fix: Upgrade Selenium version

1.2.3

fix: Chrome and chromedriver issue for image build

1.2.2

update: Update Gitlab personal token

1.2.1

remove: Remove support for fetch_n_tag_calls and tag_calls

1.1.24

add: Support for a new column in conversations table

1.1.23

Bugfix: Fix missing metrics folder and s3 upload region

1.1.22

add: blocking_disposition is now being extracted as part of call-level tagging jobs

1.1.21

add: Pipeline to invalidate situations in DB for LLMs

1.1.20

add: New s3_buckets for sandbox regions in India and US

1.1.19

add: previous_disposition is now being extracted as part of call-level tagging jobs

1.1.18

Bugfix: fix issue with sample conversation display

1.1.17

Update skit-labels version

1.1.16

Bugfix: Fix db connection issue

1.1.15

add: Pipeline to generate conversations and upload it for tagging

1.1.14

Update github access token in secrets

1.1.12

Bugfix: fix issue with fetch_calls pipeline

1.1.11

Bugfix: fix issue with retrain_slu_old pipeline

1.1.10

Bugfix: update skit-calls version

1.1.9

add: Change node selector in US

1.1.8

add: support for displaying a sample of the conversation generated

1.1.7

fix: the handling of the format of scenario param in generate sample conversations pipeline

1.1.6

add: Pipeline for generating sample conversations

1.1.5

fix: No calls found from FSM shouldn't fail data_fetch pipelines

1.1.4

fix: Intent list was missing from the comparison file while training or evaluating SLU.

1.1.3

Update node selector to match node groups in US

1.1.2

add environment variables in CI

1.1.1

add: Added re_presign_s3_urls component, which can re-presign public s3 http urls in transcription_pipeline.
fix: dataframe join bug in merge_transcription component.

1.1.0

add: Added evaluate_slu pipeline in the repo to test the slu
add: Added features of comparing repo while testing
fix: created multiple components to reduce the redundancy and complexity of the repo.
fix: Moved common functions into separate utils files for retrain and evaluate slu.

1.0.7

Update credentials for accessing fsm DB
Add support for multiple flow ids for fetch call related pipelines

1.0.6

Bugfix: update skit-calls version

1.0.4

Bugfix: Dvc config updated only for new SKU repos

1.0.2

Add support for obtaining comparison confusion matrix while retraining SLU

1.0.0

Remove deprecated pipelines

0.2.105

Functionality to compare new trained SLU model with live model on test data (#115)

0.2.104

Fixes issue with chrome browser and driver mismatch (#116)

0.2.103

Add retry for LS uploads in skit-labels (#22)

0.2.102

Redact user info using presidio (#113)
updated aws creds

0.2.101

add: s3 client aws access id and keys secrets

0.2.99

alias_bug_fix: fixes issue in downloading yaml file. (#112)
Update key related dependencies and add time logger (#110)

0.2.98

Add new pipeline to identify and persist ML compliance breaches in calls

0.2.97

fix: logs streaming during train and test (#108)

0.2.96

add: skit-calls version bump for presigned-url logic for turn audio url in US cluster

0.2.95

Refactor retraining pipelines for new SLU architecture (#107)

0.2.94

download_yaml: changes in alias file download from gitlab repo. (#106)
storage_full_fix: Removing old_data while validating pipelines.

0.2.92

fix: template_id not working properly, changed skit-calls query secrets
PL-1300: new data format support & extra meta-annotation-columns (#21)

0.2.91

Extend GPT support for fetch_n_tag_calls and update prompts (#104)
transcription pipeline: add support for mp3 file format while overlaying transcriptions (#103)

0.2.90

update: Secrets for assisted annotation pipeline

0.2.89

add: Slu customization integration with retrain_slu pipeline (#102)
add: support for call level tagging jobs for tag_calls pipeline, along with turn level tagging

0.2.88

add: Merge pull request #99 from skit-ai/gpt_intents

0.2.87

add: validate setup component for retrain_slu pipeline, tests slu setup builds in cpu node

0.2.86

update: skit-calls and skit-labels version update
fix: alternatives column postprocessing after downloading from labelstudio, client_id not being truly optional for fetch_calls component

0.2.84

update: skit-calls version bump for reftime with timezone

0.2.83

update: skit-calls version bump and use_fsm_url as param for all fetch_calls related pipelines

0.2.82

add: comma separated list of client_id is supported, template_id to filter calls is supported. Both in fetch_calls component + also modified downstream pipelines using it

0.2.81

fix: upstream column names for entity tagged dataset
update: version bump for skit-calls which helps to sample turns in a call based on comma separated list of intents

0.2.80

[PL-997] adding more call metadata from upstream for CRR tagging

0.2.79

add: skit-calls version bump which includes new columns for fetch_calls component returned csv - call_type, disposition, call_end_status, flow_name

0.2.78

update: buffer for call_quantity for fetch_calls component

0.2.77

update: use_fsm_url flag based on region for deciding turn audio uri paths should be from fsm or s3 bucket directly
update: skit-calls version bump for use-fsm-url flag
update: org_auth_token component's output is optional since tog deprecated

0.2.76

fix: made console URL region specific to fix call & slot tagging job uploads (esp in US)

0.2.75

update: removed audio validation from fetch_calls component as min_duration is present now.

0.2.74

add: get pipeline run error logs as slackbot message (#93)

0.2.73

update: skit-calls version bump - calls-with-cors path removal for audio_url in fetched calls.

0.2.72

fix: granular time filters not getting applied for ml fetch calls pipelines - date offset

0.2.71

fix: granular time filters not getting applied for ml fetch calls pipelines

0.2.70

skit-calls version bump for minute offset to def process_date_filters - in fetch_calls component

0.2.69

revert: "cache the poetry install step for faster docker builds and dev experience (#90)" (#92)

0.2.68

update: skit-calls version bump to 0.2.27 for sampling call on min_duration

0.2.67

fix: Auto-MR creation breaks in SLU training pipeline when classification report / confusion matrix is long (#91)
update: use poetry version only what mentioned in SLU repo in retrain_slu pipeline
update: cache the poetry install step for faster docker builds and dev experience (#90)
fix: asr tune pull request #89 from skit-ai/asr_tune_hi_fix

0.2.66

update: upload same data to multiple labelstudio project ids

0.2.65

add: new component file_contents_to_markdown for better gitlab mr description with reports (#88)

0.2.64

fix: data_label bug

0.2.63

update: set default value of data_label as Live

0.2.62

fix: remove_empty_audios not setting to false for fetch_calls

0.2.61

add: remove_empty_audios param controllable for fetch_calls_pipeline

0.2.60

update: make docs

0.2.59

update: remove default option for data_label in all tag_call component involving pipelines

0.2.58

add: mandatory data_label field for tag_calls component
add: optional data_labels field for fetch_tagged_data_from_labelstore component
update: skit-labels version bump to 0.3.31

0.2.57

fix: skit-calls secrets query for language - version bump to 0.2.26

0.2.56

update: skit-calls version bump to 0.2.25
fix: upload2s3 folder upload bug

0.2.55

upload2s3: correct bug in upload_as_directory that leads to flattening of path_on_disk contents in the target output_path (#82)
change: changed the parser for Hindi from Unified Parser to Character split. This is a breaking change and support for ASR models hi-v4 and below is removed (#83)

0.2.54

add: new pipeline for pushing same data for intent, entities & slot/call tagging (#81)

0.2.53

fix: retrain slu pipeline fails when only s3 uri provided

0.2.52

update: label store data format and query changes (#80)

0.2.51

update: db_host const
add: timezone based on cluster region

0.2.50

add: fetch_tagged_data_from_labelstore pipeline (#79)
fix: intent column placeholder for slu train pipeline

0.2.49

fix: upstream changes made by fetch_tagged_dataset component (#78)

0.2.48

update: single function handling both tog and labelstudio data fetching - skit-labels version bump
update: skit-calls version bump to 0.2.24

0.2.47

fix: call_type bug for inbound/outbound only
fix: downgrade google-auth-oauthlib to 0.4.6 (#77)

0.2.46

update: for pipelines using fetch_calls component, call_type defaults to both "inbound" and "outbound"
update: skit-calls version bump to 0.2.23

0.2.45

fix: training pipeline bug

0.2.44

update: gitlab token secret

0.2.43

update: retrain_slu pipeline changes to align with new Deployment CI/CD for SLU (#76)
add: raise exception when no calls found/csv empty (#75)

0.2.42

update: skit-labels version bump 0.3.29
update: discrepancy check for raw.intent and intent column for test dataset in retrain_slu_from_repo component

0.2.40

update: force utterances as str in gen_asr_metrics component (#71)
add: checks for 0 byte audios before uploading for tagging (#72)
add: custom port support while testing pipeline
update: Extending retrain_slu pipeline features (#73)
fix: entity pipelines supporting labelstudio datasets (#74)

0.2.39

update: Asr tune pipeline enhancements (#70)

0.2.38

update: secrets + us ml metrics db env vars
update: upload2s3 upload directories, asr_tune uploads directory and process true transcript in asr_eval_pipeline (#69)

0.2.37

fix: preprocess step skipped for labelstudio (#68)

0.2.36

update: irr fixes for labelstudio datasets (#66)

0.2.35

add: slack bots and channels based on cluster region

0.2.34

fix: again pipeline_constants not defined error in asr_tune component (#65)

0.2.32

fix: regionwise nodeselector for pipelines

0.2.31

fix: pipeline_constants not defined error in asr_tune component (#63)

0.2.30

fix: empty_possible param for download_file_from_s3 component

0.2.29

fix: involving nan + adjusting for interval types - fetch_tagged_entity_dataset pipeline
add: lm tuning pipeline (#58)
update: download_from_s3 component into downstream components
add: auto nodeselector based on pipeline type - cpu/gpu (#60)
update: fetch data from multiple job/project ids at once (combined data) (#61)
add: Transcription Pipeline V1 (#56)
add: retrain_slu pipeline for automated SLU retraining with deployment tracked using gitlab. (#62)

0.2.28

fix: force routes for selected clients only if project_id is not present - tag_calls.

0.2.27

fix: org_id type being different across components.

0.2.26

add: eer pipelines (#55)

0.2.25

add: fetch_tagged_entity_dataset pipeline. (#54)

0.2.24

update: force route tagging requests to labelstudio. (#53)

0.2.23

add: offset for pulling last n days fetch_tagged_dataset op, and thus used in irr_from_tog pipeline (#49)

0.2.22

add: Authentication and Authorization of requests using JWT - Oauth2

0.2.21

update: eevee yaml component for exhaustive possible intent metrics

0.2.19

update: version bump for skit-calls and skit-labels
update: new AUDIO_URL_DOMAIN secret for param in fetch_calls component

0.2.18

fix: bugs in eval_asr_pipeline

0.2.17

fix: slackbot command parser for base64 breaking when period (.) at end of text
add: run pipelines from func, without uploading yamls + easier dev workflow
add: asr-eval-pipeline phase 1
update: irr_from_tog now has mlwr=True arg (which also requires slu_project_name) for pushing calculated eevee metrics on a tog job to ml-metrics db, intent_metrics table

0.2.16

update: start_date and end_date optional for fetch_calls_pipeline, fetch_n_tag_calls, fetch_calls_n_push_to_sheets and fetch_calls_n_upload_tog_and_sheet

0.2.15

update: skit-labels version bump to 0.3.27 - tag_calls and fetch_n_tag_calls uploads data in batches of batched data with retries + sleep.

0.2.14

fix: call_type default arg from "inbound" to "INBOUND" for fetch_calls_pipeline
fix: eval_voicebot_xlmr_pipeline to work even when model is not present

0.2.13

update: skit-labels version bump to 0.3.26

0.2.12

add: irr_from_tog pipeline, it takes a tog job id/ labelstudio project id and outputs eevee's intent metrics uploaded to s3 & optionally posted on slack, an addition to eval_voicebot_xlmr_pipeline.

0.2.7

update: skit-calls version bump to 0.2.21

0.2.6

add: use slackbot reminders to run/schedule recurring pipelines
update: skit-calls version bump to 0.2.19

0.2.5

update: date time offsets for data pipelines.
update: tarfile uploads for directories.

0.2.4

fix: uploading dirs to s3
fix: tagging response defined before use.

0.2.3

update: xlmr eval output as csv.

0.2.2

fix: evaluation pipeline for intent evaluation on f1-score metric.

0.2.1

fix: training pipeline accommodates older/newer datasets.
update: tag calls raises exception if neither of tog/labelstudio ids are provided.
update: tag calls raises exception if no data was uploaded.

0.2.0

fix: slack parser handles urls in code blocks.
update: add slack thread id, channel and user id automatically.
update: slack notification component expects code_blocks instead of s3_path.
fix: slack parser compatible with python 3.8
fix: notifications are sent to slack threads if thread id is present.
feat: Notify users when a pipeline run is complete / failed.
update: tag calls returns multiple outputs.

0.1.99

fix: auth-token type.

0.1.98

fix: training pre-proc handles utterances from these columns as well: alternatives, data.

0.1.97

update: dockerfile uses python 3.8 and cuda 10.2.
fix: labelstudio download.
update: compatible with python 3.8.

0.1.96

update: normalize.comma_sep_str can be used for comma separated numbers and strings.

0.1.95

fix: TypeError: Object of type JSONDecodeError is not JSON serializable on dataset uploads.
fix: invalid literal for int() with base 10: '' on dataset uploads

0.1.94

update: skit-labels unpacks tagged dataset.
fix: AttributeError: 'NoneType' object has no attribute 'get' on fetching calls with failed predictions.
fix: compatibility between skit-labels, skit-calls, skit-auth, kfp, etc.
add: s3 url support for tag calls pipeline.

0.1.93

fix: Slack hyperlink markup removed.

0.1.92

add: Labelstudio integration.

0.1.91

update: fetch_calls_n_upload_tog_and_sheet and fetch_calls_n_push_to_sheets pipelines refactor

0.1.90

update: replace org-id with reference in s3 upload component.

0.1.89

fix: storage options.

0.1.88

fix: slack token const.
refactor: unused types.

0.1.87

fix: db host name.
update: default arguments fixed.

0.1.86

docs: Better examples, easier to copy paste commands.
add: slack notifications to all components.

0.1.85

fix: slack urls.
add: slack notifications to all components.

0.1.84

add: slack signing secret.

0.1.83

docs: Pipeline and payload docs added.
update: slackbot command parser responds with stacktrace.

0.1.82

fix: No module named 'skit_pipelines' caused by installing before copy source in dockerfile.
refactor: makefile for building pipelines is leaner.

0.1.80

add: slack-bot integration. Invoke pipelines via slackbot.
refactor: Automatic generation of pydantic models using kfp signature.

0.1.79

add: intent evaluation pipeline.
add: crr evaluation pipeline.
add: slack-bot integration.

0.1.62

update: remove helper function for upload2sheet
update: sheet duplication and row logic while pushing calls to google sheet

0.1.53

update: move helper functions of upload2sheet into a folder

0.1.52

update: skit-calls==0.2.15

0.1.48

update: skit-calls==0.2.14

0.1.45

fix: make Dockerfile and github actions yaml use google secrets

0.1.44

update: component to push CSV to a google sheet. Load google secrets from Github secrets
update: Docker file and github actions yaml to use google secrets

0.1.43

add: component to push CSV to a google sheet.
add: pipeline to fetch calls and push to google sheet.

0.1.42

update: XLMR training supports lr parameter.

0.1.41

update: skit-calls==0.2.13

0.1.40

update: skit-labels==0.3.21
update: ping users on slack channels.

0.1.39

update: skit-calls==0.2.12
update: remove recurrent from fetch-n-tag pipeline and start_date, end_date logic moved to skit-calls instead.

0.1.38

update: skit-calls==0.2.11
update: skit-labels==0.3.20
fix: save labelencoder pickle while running the train xlmr pipeline.

0.1.32

update: remove dvc link for pipelines.
add: makefile downloads secrets from skit-calls.

0.1.26

feat: APIs for pipelines.
add: secrets via dvc.

0.1.25

add: pipeline to fetch and tag a dataset.
fix: support legacy dataframes missing utterance columns.

0.1.24

update: XLMR intent classifier training pipeline pushes models to s3.

0.1.23

update: trained models upload to s3.

0.1.22

add: Conda and Cuda within base docker image.

0.1.21

add: Cuda installation within Dockerfile.
fix: Intent classifier training component.

0.1.20

fix: intent classifier featurizer fixed to apply row-wise transformation.
fix: hacky solution for ThilinaRajapakse/simpletransformers#1386

0.1.19

update: kfp installed within container.

0.1.18

update: components isolated from helper functions.

0.1.17

add: component to create utterance column utterances.
add: component to create true intent column intent_y.
add: component to add state and utterances as features for intent classifier (xlmr).
update: model training pipeline with train set only.

0.1.16

add: torch = "^1.11.0" for cuda 10.2

0.1.15

add: preprocessing module for specialized components.

0.1.14

update: skit-labels 0.3.17 with higher tolerance for utterance structures.

0.1.13

update: skit-labels skit-calls for serialized json fields.

0.1.12

update: skit-labels 0.3.13, values for db creds resolved.

0.1.11

add: placeholder component to train xlmr intent classifier.

0.1.10

add: component that fetches tagged datasets.
add: kubeflow pipeline utilizing the above component.

0.1.9

update: skit-calls 0.2.3

0.1.8

fix: Slack notification component -- Slack token constant.

0.1.7

fix: Slack notification component -- Slack token constant.

0.1.6

update: link slack component with fetch data pipeline.

0.1.5

feat: slack integration.

0.1.4

update: skit-pipelines is available within docker image.

0.1.3

update: build pipeline yamls via make.

0.1.2

refactor: modularize project.

0.1.1

add: boto3 for s3 upload/download.

0.1.0

add: calls dataset component.
add: Upload to s3 component.
add: Calls dataset pipeline.
add: workflow to automate docker image creation on tag push.
