Add AI-powered classification workflow #570

Open
3 of 10 tasks
JoelWiebe opened this issue Nov 22, 2023 · 2 comments
Labels
enhancement (New feature or request), high priority (high priority task)

Comments

@JoelWiebe
Contributor

JoelWiebe commented Nov 22, 2023

Description

Create a new workflow type used to classify posts from a source bucket into a specified number of destination buckets.

Details
Create an additional workflow type that takes as input a workflow name, a source bucket (containing the posts to be classified), and the number of categories into which the posts can be organized. Classification will use Google's Gemini model through Vertex AI. The workflow will automatically create a new bucket for each category, move each post into the bucket the LLM assigns it to, and update the bucket view so that it contains only the newly created buckets and their assigned posts.
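As a rough illustration of the three inputs described above, here is a minimal sketch of a validated workflow definition. The class name `AIClassificationWorkflow` and the field `source_id` are hypothetical, not existing names in the codebase; the 2 to 10 range and the default of 4 are taken from the task list in this issue.

```python
from dataclasses import dataclass

# Limits taken from the "# Categories" task in this issue.
MIN_CATEGORIES = 2
MAX_CATEGORIES = 10
DEFAULT_CATEGORIES = 4


@dataclass(frozen=True)
class AIClassificationWorkflow:
    """Hypothetical container for the inputs of an AI Classification workflow."""

    name: str
    source_id: str  # assumed identifier of the source bucket (or the canvas)
    num_categories: int = DEFAULT_CATEGORIES

    def __post_init__(self):
        # Reject empty names and sources before any LLM call is made.
        if not self.name.strip():
            raise ValueError("Workflow name must not be empty")
        if not self.source_id.strip():
            raise ValueError("A source bucket or canvas must be selected")
        if not MIN_CATEGORIES <= self.num_categories <= MAX_CATEGORIES:
            raise ValueError(
                f"# Categories must be between {MIN_CATEGORIES} and {MAX_CATEGORIES}"
            )
```

Validating up front like this would let the form reject bad input before a request is sent to the model.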

Tasks
Screenshot 2023-11-22 at 11 56 05 AM

  • Add a new workflow type called "AI Classification" to the Create Workflows tab in the Manage Workflows tool
  • Add a Source field used to select from any bucket or the canvas
  • Add a # Categories (or Num Categories) field that can be a slider, selector, etc. with values from 2 to 10; as a default, let's set this to 4 #596
  • To begin, let's use Vertex AI with the gemini-1.5-flash-001 model for classification
  • For authentication on the production server, we may need to (1) create a service account on our Google Cloud Platform with permissions to access the Vertex AI models, (2) download the JSON key file, (3) store the key in the Azure App Service, then (4) load the credentials and initialize the Vertex AI client using those credentials
  • Create prompts to generate the specified number of classification categories and classify each of the posts into one of those specified categories
  • Verify that the number of classification categories provided by the LLM matches the requested number of categories, then use the given classification names to create a new bucket for each of the classification categories
  • Move the posts to the destination buckets according to their assigned values, verifying that each assigned value matches one of the bucket names created; the posts may also remain in their original location when they are added to a bucket
  • Run this workflow from the Manage Workflows tab and display any errors to the teacher as a snackbar message (e.g., # Categories Mismatch: x number of categories created, y number requested; Post Assignment Mismatch: x number of post assignments to categories outside of category list)
  • Update the bucket view to display the buckets that were just created
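The two verification steps in the task list above (category count and post assignments) could be sketched roughly as follows. The function names and the shape of the inputs (a list of category names, and a mapping from post ID to assigned category) are assumptions for illustration; the error strings follow the snackbar examples given in the list.

```python
def check_categories(categories, requested):
    """Return a mismatch message if the LLM's category count is wrong, else None."""
    # Drop blanks and duplicates while preserving order.
    unique = list(dict.fromkeys(c.strip() for c in categories if c.strip()))
    if len(unique) != requested:
        return (
            f"# Categories Mismatch: {len(unique)} categories created, "
            f"{requested} requested"
        )
    return None


def assign_posts(assignments, categories):
    """Group post IDs by destination bucket name.

    Returns (buckets, error), where error is None when every assignment
    matches one of the created bucket names.
    """
    buckets = {name: [] for name in categories}
    unmatched = []
    for post_id, category in assignments.items():
        if category in buckets:
            buckets[category].append(post_id)
        else:
            unmatched.append(post_id)
    error = None
    if unmatched:
        error = (
            f"Post Assignment Mismatch: {len(unmatched)} post assignments "
            "to categories outside of category list"
        )
    return buckets, error
```

Returning the message instead of raising keeps the decision about how to surface it (for example, as a snackbar) with the caller.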
Screenshot 2023-11-22 at 12 24 57 PM
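For the authentication task in the list above, steps (3) and (4) might look something like the sketch below. It assumes the JSON key is stored in an app setting exposed as the environment variable `GOOGLE_SERVICE_ACCOUNT_JSON` (a hypothetical name) and that the `google-auth` and `google-cloud-aiplatform` packages are installed; the `us-central1` location is also an assumption.

```python
import json
import os

REQUIRED_KEYS = {"project_id", "client_email", "private_key"}


def load_service_account_info(env_var="GOOGLE_SERVICE_ACCOUNT_JSON"):
    """Read the service account key JSON from an environment variable."""
    raw = os.environ.get(env_var)
    if not raw:
        raise RuntimeError(f"Environment variable {env_var} is not set")
    info = json.loads(raw)
    missing = REQUIRED_KEYS - info.keys()
    if missing:
        raise ValueError(f"Service account key is missing fields: {sorted(missing)}")
    return info


def init_gemini_model(info, location="us-central1"):
    """Initialize Vertex AI with explicit credentials and return the model."""
    # Imported lazily so the key-loading helper works without these packages.
    from google.oauth2 import service_account
    import vertexai
    from vertexai.generative_models import GenerativeModel

    credentials = service_account.Credentials.from_service_account_info(
        info, scopes=["https://www.googleapis.com/auth/cloud-platform"]
    )
    vertexai.init(
        project=info["project_id"], location=location, credentials=credentials
    )
    return GenerativeModel("gemini-1.5-flash-001")
```

Keeping the key in an app setting rather than in the repository means it never has to be committed alongside the code.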
@JoelWiebe added the enhancement and high priority labels Nov 22, 2023
@JoelWiebe
Contributor Author

JoelWiebe commented Nov 28, 2023

@LunarFang416 Here are two resources related to classification prompts:

  • An OpenAI example that suggests using JSON to structure the text snippets
  • A PaLM API example that suggests adjusting temperature and top-K to make the output more deterministic (in our case, I believe these are set in the Firebase extension configuration)

The above assumes we already have categories. To generate our own categories, we could use a prompt such as the following:

_Analyze the provided text snippets and identify the four most prominent categories that effectively summarize the overall content. Each category should represent a distinct theme or topic that encompasses the key ideas presented in the snippets. Ensure that the categories are concise, informative, and accurately reflect the essence of the text snippets.

Text snippets:
[Insert your text snippets here]_
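One way to parameterize the prompt above by the requested number of categories is sketched below. The helper names are hypothetical, and the final instruction asking for a JSON array is an addition that is not part of the prompt as written; it is there only so that the response can be parsed reliably.

```python
import json

PROMPT_TEMPLATE = (
    "Analyze the provided text snippets and identify the {n} most prominent "
    "categories that effectively summarize the overall content. Each category "
    "should represent a distinct theme or topic that encompasses the key ideas "
    "presented in the snippets. Ensure that the categories are concise, "
    "informative, and accurately reflect the essence of the text snippets. "
    # Assumption: this output instruction is not in the original prompt.
    "Respond with only a JSON array of {n} category names.\n\n"
    "Text snippets:\n{snippets}"
)


def build_category_prompt(snippets, num_categories=4):
    """Fill the template with the snippets, one paragraph per snippet."""
    joined = "\n\n".join(s.strip() for s in snippets)
    return PROMPT_TEMPLATE.format(n=num_categories, snippets=joined)


def parse_category_names(response_text):
    """Extract a JSON array of strings from the model's reply."""
    text = response_text.strip()
    # Tolerate extra text or code fences around the array.
    start, end = text.find("["), text.rfind("]")
    if start == -1 or end < start:
        raise ValueError("No JSON array found in response")
    names = json.loads(text[start:end + 1])
    if not isinstance(names, list) or not all(isinstance(n, str) for n in names):
        raise ValueError("Expected a JSON array of strings")
    return [n.strip() for n in names]
```

The parsed names can then be compared against the requested count before any buckets are created.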

We could try displaying the text snippets as independent paragraphs or in JSON format. Also, in both cases, we could try classification with only the body text or with the title text followed by the body text to see if this makes much of a difference. Let me know if you want me to play around with this too! We can also run this in a classroom to get some real data as we design these prompts.
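To compare the variants mentioned above (independent paragraphs versus JSON, and body only versus title followed by body), a small formatter along these lines could produce all four inputs from the same posts. The post fields `id`, `title`, and `body` are assumed names for illustration, not necessarily those of the actual post model.

```python
import json


def format_snippets(posts, style="paragraphs", include_title=False):
    """Render posts as prompt input in one of the formats being compared."""
    texts = []
    for post in posts:
        body = post["body"].strip()
        title = post.get("title", "").strip()
        # Title text followed by body text, or body text only.
        texts.append(f"{title}\n{body}" if include_title and title else body)

    if style == "paragraphs":
        return "\n\n".join(texts)
    if style == "json":
        records = [{"id": p["id"], "text": t} for p, t in zip(posts, texts)]
        return json.dumps(records, indent=2)
    raise ValueError(f"Unknown style: {style}")
```

Including the post ID in the JSON variant would make it easier to map the model's classifications back to posts, which the paragraph variant does not support directly.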

@JoelWiebe moved this from Todo to Priority Todo in SCORE Meta-view Jun 20, 2024
@JoelWiebe moved this from Priority Todo to In Progress in SCORE Meta-view Jul 9, 2024
@JoelWiebe
Contributor Author

Using Teacher (AI Assistant) Agent instead

@JoelWiebe moved this from In Progress to Done in SCORE Meta-view Jan 23, 2025

3 participants