User:Mpradeepbot/ProjectStatistics.py

Instructions

The program below can be used to collect statistics on the articles that various projects maintain and develop. Articles are first categorized through the project templates added to their talk pages. Once they are categorized, this bot program can count the articles in each category. As soon as the counting is done, the statistics can also be uploaded to a page on Wikipedia.

To run this bot program, some information must be supplied through two separate files.

ProjectTemplates.txt - This file specifies which categories the statistics should be collected from. Put one category on each line.
ProjectTemplateBase.txt - The first line of this file names the page to which the statistics should be written. The remaining lines specify how the statistics should be displayed.
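As an illustration of the first file's format, here is a minimal sketch of how its contents could be turned into groups of category names. It is written for Python 3 and does not use the bot framework. It relies on two details taken from the program further down: each line is split on `$$`, so a line may carry several categories whose counts are added together, and files saved with Notepad start with a UTF-8 byte order mark. The function name `parse_category_lines` is hypothetical, and skipping blank lines is an assumption of this sketch (the bot itself does not skip them).

```python
def parse_category_lines(raw_bytes):
    """Return one list of category names per non-blank line of the file."""
    # 'utf-8-sig' drops a leading byte order mark if present and is
    # harmless when the file has none.
    text = raw_bytes.decode("utf-8-sig")
    groups = []
    for line in text.splitlines():
        if not line.strip():
            continue  # assumption of this sketch: blank lines are ignored
        # Several categories on one line are separated by '$$'.
        groups.append(line.split("$$"))
    return groups


# Hypothetical file contents: a BOM, then two lines with Windows and Unix endings.
example = b"\xef\xbb\xbfCategoryA\r\nCategoryB$$CategoryC\n"
print(parse_category_lines(example))
```

Each inner list corresponds to one figure in the final table: the article counts of all categories in the list are summed.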

The program assumes that the information in these two files follows a predetermined format, so before using the bot on any project, look at the files prepared for the following projects as examples.

For the India project, use this file as ProjectTemplates.txt and this file as ProjectTemplateBase.txt.
For the Biology project, use this file as ProjectTemplates.txt and this file as ProjectTemplateBase.txt.
For the Andhra Pradesh project, use this file as ProjectTemplates.txt and this file as ProjectTemplateBase.txt.
For the Hinduism project, use this file as ProjectTemplates.txt and this file as ProjectTemplateBase.txt.
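In the layout lines of ProjectTemplateBase.txt, every spot where a number belongs is marked with `$$`, and the program replaces those markers one at a time, in order. The following is a simplified sketch of that substitution, written for Python 3 and independent of the bot framework. `fill_placeholders` is a hypothetical helper, not part of the bot, and the table row in the example is made up.

```python
def fill_placeholders(template_text, values, marker="$$"):
    """Replace each marker, left to right, with the next value."""
    parts = template_text.split(marker)
    if len(parts) - 1 != len(values):
        raise ValueError("expected %d values, got %d"
                         % (len(parts) - 1, len(values)))
    pieces = [parts[0]]
    for value, tail in zip(values, parts[1:]):
        pieces.append(str(value))
        pieces.append(tail)
    return "".join(pieces)


# Hypothetical layout line with three markers.
row = "| $$ || $$ || $$\n"
print(fill_placeholders(row, [3, 4, 7]))
```

Unlike this sketch, the bot does not check that the number of markers matches the number of counts, which is why the two input files have to agree with each other.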

Program

import wikipedia, time, catlib, codecs

####################################################################################################
# Returns the articles in the given category as a list object.
# Pass only the category name; the 'Category:' namespace prefix
# is added automatically.
# --function requires both 'wikipedia' and 'catlib' to be imported
def getArticleList(catTitle):
    cat = catlib.Category(wikipedia.getSite(), u'Category:'+catTitle)
    listOfArticles = cat.articlesList()
    return listOfArticles
####################################################################################################


####################################################################################################
# Replace the contents in the page 'pageTitle' with data 'pageData' 
# and add the comment 'comment'
def writeData(pageTitle, pageData, comment):
  page = wikipedia.Page(wikipedia.getSite(), pageTitle)
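  try:
    # Load the page's text from the wiki; the loaded text is not used
    # below, because the page is overwritten in full
    data = page.get()
  except wikipedia.NoPage:
    data = u''
  # replace whatever was loaded with the new contents
  data = pageData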
  try:
    page.put(data, comment = comment)
  except wikipedia.EditConflict:
    wikipedia.output(u'Skipping %s because of edit conflict' % (page.title()))
  except wikipedia.SpamfilterError, url:
    wikipedia.output(u'Cannot change %s because of blacklist entry %s' % (page.title(), url))
####################################################################################################


####################################################################################################
# Calculates the project statistics for the given input templates and the projects
# and then updates the statistics on to the wikipedia.
def calculateProjectStatistics(ProjectTemplates, ProjectTemplateBase):
  # open all the required files
  logfile = codecs.open('mpc.ProjectStatistics.log', encoding='utf-8', mode='wb')
  templateFile = open(ProjectTemplates, 'rb' )
  templateBase = open(ProjectTemplateBase, 'rb' )

  # Skip the 3-byte UTF-8 byte order mark (BOM) that Notepad writes at the
  # start of a file. Comment out the two lines below if the files were not
  # saved with Notepad (that is, if they have no BOM).
  templateFile.read(3)
  templateBase.read(3)

  # the first line names the target page; strip the trailing line break
  templateTitle = unicode(templateBase.readline(), 'utf8').strip()
  templateLine = u""
  newText = u""

  i=0
  sum=0
  extra = 0
  readNext = 1
  grandTotal = 0
  counts = [0, 0, 0, 0, 0]

  for line in templateFile:
      line = unicode(line, 'utf8')
      line = line.replace(u'\n', u'')
      line = line.replace(u'\r', u'')

      catList = line.split('$$')
      articleCount = 0
      for categoryName in catList:
         aList = []
         aList = getArticleList(categoryName)
         articleCount = articleCount + len(aList)

      if i>=30:
         extra = articleCount
         continue
      else:
         counts[i%5] += articleCount

      sum += articleCount
      i = i + 1

      while readNext == 1:
         logfile.write(templateLine)
         newText += templateLine
         templateLine = u"" + unicode(templateBase.readline(), 'utf8')
         if templateLine.find('$$') != -1:
            readNext = 0

      templateLine = templateLine.replace('$$', str(articleCount), 1)

      if i%5 == 0:
         templateLine = templateLine.replace('$$', str(sum), 1)
         sum = 0

      if templateLine.find('$$') == -1:
         readNext = 1

  while readNext == 1:
     logfile.write(templateLine)
     newText += templateLine
     templateLine = u"" + unicode(templateBase.readline(), 'utf8')
     if templateLine.find('$$') != -1:
        readNext = 0

  templateLine = templateLine.replace('$$', str(extra), 1)
  if templateLine.find('$$') == -1:
     readNext = 1

  # write the column sums
  for sum in counts:
     while readNext == 1:
        logfile.write(templateLine)
        newText += templateLine
        templateLine = u"" + unicode(templateBase.readline(), 'utf8')
        if templateLine.find('$$') != -1:
           readNext = 0
   
     templateLine = templateLine.replace('$$', str(sum), 1)
     grandTotal += sum
     if templateLine.find('$$') == -1:
        readNext = 1

  while readNext == 1:
     logfile.write(templateLine)
     newText += templateLine
     templateLine = u"" + unicode(templateBase.readline(), 'utf8')
     if templateLine.find('$$') != -1:
        readNext = 0

  # write the grand total
  templateLine = templateLine.replace('$$', str(grandTotal + extra), 1)

  logfile.write(templateLine)
  newText += templateLine
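  for line in templateBase:
     # decode first: appending raw non-ASCII bytes to unicode text
     # raises UnicodeDecodeError
     line = unicode(line, 'utf8')
     logfile.write(line)
     newText += line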

  writeData(templateTitle, newText, u"Bot updating at time: " + time.strftime("[%a, %d %b %Y %H:%M:%S]  ", time.localtime(time.time())))

  templateBase.close()
  templateFile.close()
  logfile.close()
####################################################################################################


####################################################################################################
calculateProjectStatistics('IndiaTemplates.txt', 'IndiaTemplateBase.txt')
calculateProjectStatistics('BIOTemplates.txt', 'BIOTemplateBase.txt')
calculateProjectStatistics('APTemplates.txt', 'APTemplateBase.txt')
calculateProjectStatistics('HRTemplates.txt', 'HRTemplateBase.txt')
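
The order in which the program produces its numbers decides which `$$` marker each one lands in, so it helps to see that order in isolation. The sketch below reproduces it as read from the code above: each count is followed by a row total after every fifth count, then come the extra count, the column totals, and the grand total. It is written for Python 3 and is a simplification. `value_sequence` and its parameters are hypothetical names, the bot hard-codes 5 columns and 30 cells, and the bot only writes a row total when a `$$` marker remains on the same layout line.

```python
def value_sequence(cell_counts, extra, columns=5):
    """Return the numbers in the order the placeholders are filled."""
    values = []
    column_totals = [0] * columns
    row_total = 0
    for index, count in enumerate(cell_counts):
        values.append(count)
        column_totals[index % columns] += count
        row_total += count
        if (index + 1) % columns == 0:
            # end of a table row: emit the row total and start over
            values.append(row_total)
            row_total = 0
    values.append(extra)                       # the count outside the grid
    values.extend(column_totals)               # one total per column
    values.append(sum(column_totals) + extra)  # grand total
    return values


# Small made-up example with 3 columns instead of the bot's 5.
print(value_sequence([1, 2, 3, 4, 5, 6], extra=10, columns=3))
# -> [1, 2, 3, 6, 4, 5, 6, 15, 10, 5, 7, 9, 31]
```

A layout file written for this order therefore needs its markers arranged as rows of cells with a trailing row total, followed by one marker for the extra count, one per column total, and a final one for the grand total.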