
Find Top X By Count From MySQL In Python?

I have a csv file like this:

nohaelprince@uwaterloo.ca, 01-05-2014
nohaelprince@uwaterloo.ca, 01-05-2014
nohaelprince@uwaterloo.ca, 01-05-2014
nohaelprince@gmail.com, 01-05-2014

I

Solution 1:

The report you need can be produced in SQL on the MySQL side. Python then only has to run the query, fetch the result set, and print the results.

Consider the following aggregate query, which uses a subquery and a derived table and follows this percentage growth formula:

((this month domain total cnt) - (last month domain total cnt))
 / (last month all domains total cnt)
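
To make the formula concrete, here is a minimal Python sketch with made-up numbers. The function name `pct_growth` and the sample counts are hypothetical and chosen only for illustration; they are not part of the original data.

```python
def pct_growth(this_month_domain, last_month_domain, last_month_all):
    """Growth of one domain relative to last month's total across all domains."""
    if last_month_all == 0:
        # Undefined in this case; the SQL version yields NULL when no older rows exist.
        return None
    return (this_month_domain - last_month_domain) / last_month_all


# Hypothetical counts: one domain went from 20 to 50 entries,
# and all domains together had 200 entries last month.
growth = pct_growth(50, 20, 200)
print(growth)  # (50 - 20) / 200 = 0.15
```

Note that the denominator is the total over all domains, not the domain's own previous count, so the values are comparable across domains.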

SQL

SELECT  domain_name, pct_growth
FROM (

SELECT t1.domain_name,  
         # SUM OF SPECIFIC DOMAIN'S CNT BETWEEN TODAY AND 30 DAYS AGO  
        (Sum(CASE WHEN t1.date_of_entry >= (CURRENT_DATE - INTERVAL 30 DAY) 
                  THEN t1.cnt ELSE 0 END)               
         -
         # SUM OF SPECIFIC DOMAIN'S CNT AS OF 30 DAYS AGO
         Sum(CASE WHEN t1.date_of_entry < (CURRENT_DATE - INTERVAL 30 DAY) 
                  THEN t1.cnt ELSE 0 END) 
        ) /   
        # SUM OF ALL DOMAINS' CNT AS OF 30 DAYS AGO
        (SELECT SUM(t2.cnt) FROM domains t2 
          WHERE t2.date_of_entry < (CURRENT_DATE - INTERVAL 30 DAY))
         As pct_growth   

FROM domains t1
GROUP BY t1.domain_name
) As derivedTable

ORDER BY pct_growth DESC
LIMIT 50;

Python

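# db is assumed to be an already open DB-API connection to the MySQL database
cur = db.cursor()
sql = "SELECT * FROM ..."  # SEE ABOVE

cur.execute(sql)

# Each row is a tuple: (domain_name, pct_growth)
for row in cur.fetchall():
    print(row)

cur.close()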
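
For a slightly fuller picture of the same execute-and-fetch pattern, the sketch below wraps it in a small helper. It uses the standard-library `sqlite3` module as a stand-in for a MySQL connection, purely so that it runs without a server. The helper name `run_report`, the sample rows, and the simplified query are assumptions for illustration, not part of the original answer.

```python
import sqlite3


def run_report(conn, sql):
    """Execute a query on a DB-API connection and return all rows as a list."""
    cur = conn.cursor()
    try:
        cur.execute(sql)
        return cur.fetchall()
    finally:
        cur.close()


# In-memory stand-in for the domains table (hypothetical sample data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE domains (domain_name TEXT, date_of_entry TEXT, cnt INTEGER)"
)
conn.executemany(
    "INSERT INTO domains VALUES (?, ?, ?)",
    [
        ("uwaterloo.ca", "2014-05-01", 3),
        ("gmail.com", "2014-05-01", 1),
    ],
)

# Simplified query: total count per domain, largest first.
rows = run_report(
    conn,
    "SELECT domain_name, SUM(cnt) AS total FROM domains "
    "GROUP BY domain_name ORDER BY total DESC LIMIT 50",
)
for row in rows:
    print(row)
```

With a MySQL driver, the helper itself should work unchanged, since it relies only on the cursor methods shown in the snippet above; the connection setup and the query text would differ.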

Solution 2:

If I understand correctly, you just need, for each domain, the ratio of the count from the past month to the total count. You can get this using conditional aggregation. So, assuming that cnt is always greater than 0 (so the total is never zero):

select d.domain_name,
       sum(cnt) as CntTotal,
       sum(case when date_of_entry >= date_sub(now(), interval 1 month) then cnt else 0 end) as Cnt30Days,
       (sum(case when date_of_entry >= date_sub(now(), interval 1 month) then cnt else 0 end) / sum(cnt)) as Ratio30Days
from domains d
group by d.domain_name
order by Ratio30Days desc;
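
The conditional aggregation above can be tried out locally with a sketch like the following. It swaps MySQL for the standard-library `sqlite3` module and computes the cutoff date in Python instead of with `date_sub(now(), ...)`, so it is an approximation of the query rather than the same statement. The sample rows, the fixed "today", and the column aliases are hypothetical.

```python
import sqlite3
from datetime import date, timedelta

# Fixed "today" so the result is reproducible; 30 days earlier is 2014-05-02.
today = date(2014, 6, 1)
cutoff = (today - timedelta(days=30)).isoformat()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE domains (domain_name TEXT, date_of_entry TEXT, cnt INTEGER)"
)
conn.executemany(
    "INSERT INTO domains VALUES (?, ?, ?)",
    [
        ("uwaterloo.ca", "2014-03-10", 6),
        ("uwaterloo.ca", "2014-05-20", 2),
        ("gmail.com", "2014-04-01", 1),
        ("gmail.com", "2014-05-25", 3),
    ],
)

# "* 1.0" forces floating-point division in SQLite; MySQL's "/" does not need it.
sql = """
SELECT domain_name,
       SUM(cnt) AS cnt_total,
       SUM(CASE WHEN date_of_entry >= ? THEN cnt ELSE 0 END) AS cnt_30_days,
       SUM(CASE WHEN date_of_entry >= ? THEN cnt ELSE 0 END) * 1.0 / SUM(cnt)
           AS ratio_30_days
FROM domains
GROUP BY domain_name
ORDER BY ratio_30_days DESC
"""
ratios = conn.execute(sql, (cutoff, cutoff)).fetchall()
for row in ratios:
    print(row)
```

Dates are stored as ISO strings here so that plain string comparison orders them correctly; a real MySQL table would more likely use a DATE column.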
