Data Platform for text-based mental health support

Crisis Text Line
Project manager on the customer side
Alexey Artemov
Data platform Tech-lead
Project timeline
December 2022 - September 2023
Project scope
1050 man-hours
Implement a data platform that provides:
  • Single point of truth
  • Batch processing
  • Real-time data processing
  • Machine Learning pipelines/life cycle
  • LLM
  • ML models serving

Project Results
  • Installed the data platform components
  • Established ETL pipelines (ingest & transformation)
  • Established SDLC (software development life cycle)
  • Automated CI/CD
  • Established monitoring (ETL, data freshness, notifications)
  • Defined the entire infrastructure as code (IaC)
  • Onboarded DS & R&I teams
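
The data freshness monitoring mentioned above can be illustrated with a minimal sketch. The dataset names, thresholds, and the stale_datasets function are hypothetical and are not taken from the project; the sketch only shows the general idea of comparing each dataset's last successful load against an allowed age.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical freshness thresholds per dataset (names are illustrative only).
MAX_AGE = {
    "conversations": timedelta(hours=8),
    "messages": timedelta(hours=8),
}


def stale_datasets(last_loaded, now, max_age=MAX_AGE):
    """Return the sorted names of datasets whose last load is older than allowed."""
    stale = []
    for name, limit in max_age.items():
        loaded_at = last_loaded.get(name)
        # A dataset that has never been loaded counts as stale.
        if loaded_at is None or now - loaded_at > limit:
            stale.append(name)
    return sorted(stale)


now = datetime(2023, 6, 1, 12, 0, tzinfo=timezone.utc)
last_loaded = {
    "conversations": now - timedelta(hours=2),
    "messages": now - timedelta(hours=11),
}
print(stale_datasets(last_loaded, now))  # prints ['messages']
```

In practice the result of such a check would feed the notification channel; how the project wired that up is not described here.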

The uniqueness of the project

The Data platform should meet the following criteria:

  • Support batch and stream processing.
  • Support ML pipelines.
  • Be deployable in different countries (US, EU, Canada).
  • Support multiple languages for data anonymization.
  • Ensure, through monitoring, that data from one country cannot be uploaded to or stored in another country's instance (e.g. EU data in the US instance).
  • Offer deployment flexibility (independent feature deployment to different instances/countries).
  • Be defined as IaC (infrastructure as code).
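
The data residency criterion can be sketched as a simple guard that runs before a batch is written to an instance. This is only an illustration under stated assumptions: the instance keys, the "origin" field on each record, and the names ALLOWED_ORIGIN, ResidencyViolation, and check_residency are all hypothetical and do not come from the project.

```python
# Hypothetical mapping from platform instance to the only data origin it may hold.
ALLOWED_ORIGIN = {"us": "US", "eu": "EU", "ca": "CA"}


class ResidencyViolation(Exception):
    """Raised when a batch contains records from another country."""


def check_residency(instance, records):
    """Return the record count, or raise if any record has a foreign origin."""
    allowed = ALLOWED_ORIGIN[instance]
    # Records with a missing origin are treated as violations as well.
    offending = [r["id"] for r in records if r.get("origin") != allowed]
    if offending:
        raise ResidencyViolation(
            f"{len(offending)} record(s) not allowed in '{instance}' instance: {offending}"
        )
    return len(records)


batch = [{"id": 1, "origin": "EU"}, {"id": 2, "origin": "EU"}]
print(check_residency("eu", batch))  # prints 2
```

Failing the whole batch rather than silently dropping the offending records is a design choice of this sketch: it makes a misrouted upload visible to monitoring instead of hiding it.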

Used software
The following technologies have been used:
  • AWS (infrastructure, AWS DMS, S3, etc.)
  • Apache Spark (data processing framework)
  • GitHub
  • Terraform & Terragrunt for IaC

Difficulty of implementation
  • We initially planned to use CDC (change data capture). However, AWS DMS proved difficult to troubleshoot, and we also weighed the effort required to implement incremental ingestion and transformation. As a result, we decided to start with batch ingestion, pulling data several times per day. This approach was sufficient from a business perspective.
  • The initial infrastructure was deployed without IaC (infrastructure as code), which caused problems whenever adjustments were needed and would have complicated the planned multi-country support. We therefore moved the infrastructure to IaC.
  • Our CI/CD process combines different technologies. Some parts are automated with GitHub Actions, while others rely on running Terraform/Terragrunt scripts. In the future, we aim to unify and simplify this process.
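
As an illustration of batch ingestion several times per day, here is a minimal sketch of one possible layout in which every pull writes a full snapshot under a timestamped prefix and downstream jobs read the latest one. It is not the project's actual layout: the prefix scheme and the functions snapshot_prefix and latest_snapshot are assumptions made for this example.

```python
from datetime import datetime, timezone


def snapshot_prefix(table, run_time):
    """Build a hypothetical S3-style key prefix for one full-snapshot pull."""
    return f"raw/{table}/snapshot_ts={run_time.strftime('%Y%m%dT%H%M%SZ')}/"


def latest_snapshot(prefixes):
    """Pick the most recent snapshot; fixed-width timestamps sort lexicographically."""
    return max(prefixes) if prefixes else None


# Three pulls on the same day, e.g. every eight hours (an illustrative schedule).
runs = [datetime(2023, 6, 1, hour, 0, tzinfo=timezone.utc) for hour in (0, 8, 16)]
prefixes = [snapshot_prefix("messages", run) for run in runs]
print(latest_snapshot(prefixes))
```

Full snapshots avoid the incremental merge logic that CDC requires, at the cost of re-reading unchanged rows on every pull.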

Project Description

Empathy and innovation lie at the core of our company's mission. Our primary objective is to improve mental well-being for people worldwide, regardless of geographical barriers.

To realize this mission, we need to make use of modern technologies, including LLMs. The central focus of our Data Platform is the transition from our existing legacy system to a more robust, dynamic, industry-grade infrastructure. This transition will give us the capabilities we need to deliver better mental health support. Key enhancements include:

  • Improving message classification (risk level) to ensure prompt assignment of Volunteer Crisis Counselors to critical cases.
  • Real-time monitoring of demand spikes, facilitating rapid outreach to Volunteer Crisis Counselors to address increased needs.
  • Developing realistic simulations using LLMs.

With these initiatives, we aim to make a significant positive impact on mental well-being, regardless of where individuals are located.

Project geography
