Find Top X By Count From MySQL In Python?
I have a csv file like this:
nohaelprince@uwaterloo.ca, 01-05-2014
nohaelprince@uwaterloo.ca, 01-05-2014
nohaelprince@uwaterloo.ca, 01-05-2014
nohaelprince@gmail.com, 01-05-2014
I
Solution 1:
The report you need can be produced in SQL on the MySQL side; Python only has to run the query, fetch the result set, and print the rows.
Consider the following aggregate query, which uses a subquery and a derived table and follows this percentage growth formula:
((this month domain total cnt) - (last month domain total cnt))
/ (last month all domains total cnt)
SQL
SELECT domain_name, pct_growth
FROM (
    SELECT t1.domain_name,
           # SUM OF SPECIFIC DOMAIN'S CNT BETWEEN TODAY AND 30 DAYS AGO
           (SUM(CASE WHEN t1.date_of_entry >= (CURRENT_DATE - INTERVAL 30 DAY)
                     THEN t1.cnt ELSE 0 END)
            -
            # SUM OF SPECIFIC DOMAIN'S CNT AS OF 30 DAYS AGO
            SUM(CASE WHEN t1.date_of_entry < (CURRENT_DATE - INTERVAL 30 DAY)
                     THEN t1.cnt ELSE 0 END)
           ) /
           # SUM OF ALL DOMAINS' CNT AS OF 30 DAYS AGO
           (SELECT SUM(t2.cnt) FROM domains t2
            WHERE t2.date_of_entry < (CURRENT_DATE - INTERVAL 30 DAY))
           AS pct_growth
    FROM domains t1
    GROUP BY t1.domain_name
) AS derivedTable
ORDER BY pct_growth DESC
LIMIT 50;
Python
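import MySQLdb  # assumption: any DB-API 2.0 driver (e.g. pymysql) is used the same way

# Placeholder credentials; replace with your own connection settings.
db = MySQLdb.connect(host="localhost", user="user", passwd="password", db="mydb")
cur = db.cursor()
sql = "SELECT * FROM ..."  # SEE ABOVE
cur.execute(sql)
for row in cur.fetchall():
    print(row)
db.close()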
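To sanity-check the query on a small sample, the same growth formula can be sketched in plain Python. This is only an illustration under stated assumptions: the `pct_growth` helper, the sample rows, and the fixed `today` are hypothetical, and the `(domain_name, date_of_entry, cnt)` row layout is inferred from the column names used in the query.

```python
from collections import defaultdict
from datetime import date, timedelta


def pct_growth(rows, today, top=50):
    """Rank domains by (recent cnt - earlier cnt) / (earlier cnt of all domains).

    rows: iterable of (domain_name, date_of_entry, cnt) tuples (assumed layout).
    """
    cutoff = today - timedelta(days=30)
    recent = defaultdict(int)   # cnt per domain on or after the cutoff
    earlier = defaultdict(int)  # cnt per domain before the cutoff
    for domain, entry_date, cnt in rows:
        if entry_date >= cutoff:
            recent[domain] += cnt
        else:
            earlier[domain] += cnt

    all_earlier = sum(earlier.values())
    if all_earlier == 0:
        # The SQL version would produce NULL here; this sketch returns no rows.
        return []

    domains = set(recent) | set(earlier)
    growth = [(d, (recent[d] - earlier[d]) / all_earlier) for d in domains]
    growth.sort(key=lambda pair: pair[1], reverse=True)
    return growth[:top]


# Hypothetical sample data, not taken from the question.
sample = [
    ("gmail.com", date(2014, 5, 20), 30),
    ("gmail.com", date(2014, 3, 1), 10),
    ("uwaterloo.ca", date(2014, 5, 25), 5),
    ("uwaterloo.ca", date(2014, 3, 2), 30),
]
result = pct_growth(sample, today=date(2014, 6, 1))
for domain, growth in result:
    print(domain, growth)
```

Doing the aggregation in SQL is still preferable for large tables; a helper like this is mainly useful for checking the formula against a handful of known rows.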
Solution 2:
If I understand correctly, you just need the ratio of the past thirty days' count to the total count. You can get this using conditional aggregation, assuming that cnt is always greater than 0 (so that the denominator sum(cnt) is never zero):
select d.domain_name,
       sum(cnt) as CntTotal,
       sum(case when date_of_entry >= date_sub(now(), interval 1 month) then cnt else 0 end) as Cnt30Days,
       (sum(case when date_of_entry >= date_sub(now(), interval 1 month) then cnt else 0 end) / sum(cnt)) as Ratio30Days
from domains d
group by d.domain_name
order by Ratio30Days desc;
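The conditional aggregation above amounts to keeping two running sums per domain: one over all rows and one over recent rows only. A minimal Python sketch of that idea, assuming a `(domain_name, date_of_entry, cnt)` row layout; the `ratio_last_30_days` helper and the sample rows are hypothetical, and a fixed 30-day window stands in for MySQL's `interval 1 month`.

```python
from collections import defaultdict
from datetime import date, timedelta


def ratio_last_30_days(rows, today):
    """Return (domain, total cnt, recent cnt, recent/total), highest ratio first."""
    cutoff = today - timedelta(days=30)
    total = defaultdict(int)   # plays the role of sum(cnt)
    recent = defaultdict(int)  # plays the role of sum(case when ... then cnt else 0 end)
    for domain, entry_date, cnt in rows:
        total[domain] += cnt
        if entry_date >= cutoff:
            recent[domain] += cnt

    # Skip domains whose total is zero to avoid dividing by zero.
    report = [
        (d, total[d], recent[d], recent[d] / total[d])
        for d in total
        if total[d] > 0
    ]
    report.sort(key=lambda r: r[3], reverse=True)
    return report


# Hypothetical sample data, not taken from the question.
rows_sample = [
    ("gmail.com", date(2014, 5, 20), 30),
    ("gmail.com", date(2014, 3, 1), 10),
    ("uwaterloo.ca", date(2014, 5, 25), 10),
    ("uwaterloo.ca", date(2014, 3, 2), 30),
]
report = ratio_last_30_days(rows_sample, today=date(2014, 6, 1))
for line in report:
    print(line)
```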