Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add User Lifetime Statistics #309

Merged
merged 7 commits into from
Apr 14, 2024
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion backend/Pipfile
Original file line number Diff line number Diff line change
Expand Up @@ -45,4 +45,4 @@ typing-extensions = "*"
drf-excel = "*"

[requires]
python_version = "3"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure why the Pipfile and Pipfile.lock are changed. In general, if there are no dependency changes, we shouldn't touch these cause they're read directly from our production machines to install. Would suggest following this article to remove these changes from the PR

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to fix this with the stackoverflow article but it kind of does not look like it did anything per the last commit- I think I could use some help with this at GBM

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good

python_version = "3"
2,352 changes: 1,227 additions & 1,125 deletions backend/Pipfile.lock

Large diffs are not rendered by default.

2 changes: 2 additions & 0 deletions backend/ohq/admin.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@
Question,
Queue,
QueueStatistic,
UserStatistic,
Semester,
Tag,
)
Expand All @@ -26,3 +27,4 @@
admin.site.register(QueueStatistic)
admin.site.register(Announcement)
admin.site.register(Tag)
admin.site.register(UserStatistic)
67 changes: 67 additions & 0 deletions backend/ohq/management/commands/user_stat.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,67 @@
from django.core.management.base import BaseCommand
from django.utils import timezone
from datetime import timedelta
import logging

from ohq.models import Profile, Course, Question
from ohq.statistics import (
user_calculate_questions_asked,
user_calculate_questions_answered,
user_calculate_time_helped,
user_calculate_time_helping,
user_calculate_students_helped,
)

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger(__name__)


class Command(BaseCommand):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's talk about this file on Friday!

def add_arguments(self, parser):
parser.add_argument("--hist", action="store_true", help="Calculate all historic statistics")

def calculate_statistics(self, profiles, courses, earliest_date):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The earliest_date parameter here is used to compute a date field, which I don't think is quite necessary. I think we can directly just use the earliest_date field wherever date is used.

yesterday = timezone.datetime.today().date() - timezone.timedelta(days=1)
active_users_today = set()

for course in courses:
logger.debug("course here is", course)
if earliest_date:
date = earliest_date
else:
course_questions = Question.objects.filter(queue__course=course)
date = (
timezone.template_localtime(
course_questions.earliest("time_asked").time_asked
).date()
if course_questions
else yesterday
)

course_questions = Question.objects.filter(queue__course=course, time_asked__gte=date)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from django.db.models import Q

questions_queryset = Question.objects.filter(queue__course=course, time_asked__gte=date)
users_union = (
    User.objects.filter(
        Q(id__in=questions_queryset.values_list("asked_by", flat=True)) |
        Q(id__in=questions_queryset.values_list("responded_to_by", flat=True))
    )
)

for q in course_questions:
active_users_today.add(q.asked_by)
active_users_today.add(q.responded_to_by)

for profile in profiles:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be slow because we are iterating over all profiles, it is probably best to do this in the DB somehow (like the code snippet from above)

if profile.user in active_users_today:
user_calculate_questions_asked(profile.user)
user_calculate_questions_answered(profile.user)
user_calculate_time_helped(profile.user)
user_calculate_time_helping(profile.user)
user_calculate_students_helped(profile.user)


def handle(self, *args, **kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks a lot cleaner, good work!

if kwargs["hist"]:
courses = Course.objects.all()
profiles = Profile.objects.all()
earliest_date = None
else:
courses = Course.objects.filter(archived=False)
profiles = Profile.objects.all()
earliest_date = timezone.now().date() - timedelta(days=1)

self.calculate_statistics(profiles, courses, earliest_date)


29 changes: 29 additions & 0 deletions backend/ohq/migrations/0020_auto_20240326_0226.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
# Generated by Django 3.1.7 on 2024-03-26 02:26

from django.conf import settings
from django.db import migrations, models
import django.db.models.deletion


class Migration(migrations.Migration):

dependencies = [
migrations.swappable_dependency(settings.AUTH_USER_MODEL),
('ohq', '0019_auto_20211114_1800'),
]

operations = [
migrations.CreateModel(
name='UserStatistic',
fields=[
('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
('metric', models.CharField(choices=[('TOTAL_QUESTIONS_ASKED', 'Total questions asked'), ('TOTAL_QUESTIONS_ANSWERED', 'Total questions answered'), ('TOTAL_TIME_BEING_HELPED', 'Total time being helped'), ('TOTAL_TIME_HELPING', 'Total time helping'), ('TOTAL_STUDENTS_HELPED', 'Total students helped')], max_length=256)),
('value', models.DecimalField(decimal_places=8, max_digits=16)),
('user', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to=settings.AUTH_USER_MODEL)),
],
),
migrations.AddConstraint(
model_name='userstatistic',
constraint=models.UniqueConstraint(fields=('user', 'metric'), name='unique_user_statistic'),
),
]
32 changes: 32 additions & 0 deletions backend/ohq/models.py
Original file line number Diff line number Diff line change
Expand Up @@ -411,3 +411,35 @@ class Announcement(models.Model):
author = models.ForeignKey(User, related_name="announcements", on_delete=models.CASCADE)
time_updated = models.DateTimeField(auto_now=True)
course = models.ForeignKey(Course, related_name="announcements", on_delete=models.CASCADE)


class UserStatistic(models.Model):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good

"""
Statistics related to a user (student or TA) across many courses
"""

METRIC_TOTAL_QUESTIONS_ASKED = "TOTAL_QUESTIONS_ASKED"
METRIC_TOTAL_QUESTIONS_ANSWERED = "TOTAL_QUESTIONS_ANSWERED"
METRIC_TOTAL_TIME_BEING_HELPED = "TOTAL_TIME_BEING_HELPED"
METRIC_TOTAL_TIME_HELPING = "TOTAL_TIME_HELPING"
METRIC_TOTAL_STUDENTS_HELPED = "TOTAL_STUDENTS_HELPED"

METRIC_CHOICES = [
(METRIC_TOTAL_QUESTIONS_ASKED, "Total questions asked"),
(METRIC_TOTAL_QUESTIONS_ANSWERED, "Total questions answered"),
(METRIC_TOTAL_TIME_BEING_HELPED, "Total time being helped"),
(METRIC_TOTAL_TIME_HELPING, "Total time helping"),
(METRIC_TOTAL_STUDENTS_HELPED, "Total students helped"),
]

user = models.ForeignKey(User, on_delete=models.CASCADE)
metric = models.CharField(max_length=256, choices=METRIC_CHOICES)
value = models.DecimalField(max_digits=16, decimal_places=8)

class Meta:
constraints = [
models.UniqueConstraint(fields=["user", "metric"], name="unique_user_statistic")
]



69 changes: 68 additions & 1 deletion backend/ohq/statistics.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,8 +2,9 @@
from django.db.models import Avg, Case, Count, F, Sum, When
from django.db.models.functions import TruncDate
from django.utils import timezone
from decimal import Decimal, ROUND_HALF_UP

from ohq.models import CourseStatistic, Question, QueueStatistic
from ohq.models import CourseStatistic, Question, QueueStatistic, UserStatistic


User = get_user_model()
Expand Down Expand Up @@ -232,3 +233,69 @@ def queue_calculate_questions_per_ta_heatmap(queue, weekday, hour):
hour=hour,
defaults={"value": statistic if statistic else 0},
)


def user_calculate_questions_asked(user):
num_questions = Decimal(Question.objects.filter(asked_by=user).count())
UserStatistic.objects.update_or_create(
user=user,
metric=UserStatistic.METRIC_TOTAL_QUESTIONS_ASKED,
defaults={"value": num_questions},
)


def user_calculate_questions_answered(user):
num_questions = Decimal(Question.objects.filter(responded_to_by=user).count())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should add status=Question.STATUS_ANSWERED to make the query more accurate

UserStatistic.objects.update_or_create(
user=user,
metric=UserStatistic.METRIC_TOTAL_QUESTIONS_ANSWERED,
defaults={"value": num_questions},
)



def user_calculate_time_helped(user):
user_time_helped = (
Question.objects.filter(asked_by=user)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this query work if the question was rejected or withdrawn? In that case the time_response_started field would be null. If it doesn't, perhaps filter for where the question status is answered?

.aggregate(time_helped=Sum(F("time_responded_to") - F("time_response_started")))
)
time = user_time_helped["time_helped"]

UserStatistic.objects.update_or_create(
user=user,
metric=UserStatistic.METRIC_TOTAL_TIME_BEING_HELPED,
defaults={"value": time.seconds if time else Decimal('0.00000000')},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am more inclined to not call update_or_create at all if time.seconds is None instead of updating the value to be 0.

)


def user_calculate_time_helping(user):
user_time_helping = (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto for comments regarding user_calculate_time_helped

Question.objects.filter(responded_to_by=user)
.aggregate(time_answering=Sum(F("time_responded_to") - F("time_response_started")))
)
time = user_time_helping["time_answering"]

UserStatistic.objects.update_or_create(
user=user,
metric=UserStatistic.METRIC_TOTAL_TIME_HELPING,
defaults={"value": time.seconds if time else Decimal('0.00000000')},
)


def user_calculate_students_helped(user):
num_students = Decimal(
Question.objects.filter(
status=Question.STATUS_ANSWERED,
responded_to_by=user
)
.distinct("asked_by")
.count()
)
UserStatistic.objects.update_or_create(
user=user,
metric=UserStatistic.METRIC_TOTAL_STUDENTS_HELPED,
defaults={"value": num_students},
)



27 changes: 27 additions & 0 deletions backend/package-lock.json

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

5 changes: 5 additions & 0 deletions backend/package.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

"devDependencies": {
"@types/node": "^20.11.30"
}
}
Loading
Loading