PipeTK 0.2.5.1
Pipe Tool Kit
=============

This tool kit is an attempt to compile a set of tools I've created to solve
problems for La Quadrature du Net. Those tools are designed to work with unix
shell pipes. They follow the unix philosophy and aim to stay simple.

I'm going to try to keep them at least a bit documented, but since I'm always in
a hurry I can't guarantee this will always be the case.

You can get an idea of the power of this kind of problem-solving approach
here: http://blog.worlddomination.be/projects/ungarage.html (this tutorial doesn't
feature those tools for the moment).

For organisational reasons, all the commands start with the letter "p".

There is room for a lot of improvement: most of those tools are very
simplistic and could be generalised (for example pdetinyfy or purltitle could
be applied to each url of a sentence, purls could use regex and could be
generalised for anything by accepting a regex as an argument).

It's possible I'm reinventing the wheel for some of those commands. Don't
hesitate to tell me if that's the case.

Current tools
=============

* pipetk: list the available commands with the information below.

* puniq: eliminate duplicates in real time in a unix shell pipe.
  (| sort | uniq waits for the full stream to complete and also sorts it,
  so it won't work, for example, between a rsstail|feedstail and an ii file).

  Example: ::

    feedstail -u http://reddit.com/.rss | puniq

* pmerge: open a named pipe and write to stdout everything written to the
  named pipe. The first argument given is the name of the named pipe. It
  doesn't manage in any way the possible conflicts between multiple processes
  that write to that pipe at the same time.

  Silly example: ::

    pmerge PIPE > irc.freenode.net/#laquadrature &
    echo "pouet" > PIPE

* purls: really simple url extracting tool. It splits each line it receives
  on whitespace, then displays on a new line each word that starts with
  either "http" or "https".

  Example: ::

    echo "there are 2 urls in this sentence: this one http://blog.worlddomination.be and this one http://laquadrature.net" | purls
    http://blog.worlddomination.be
    http://laquadrature.net

* pdetinyfy: get the real url of a shortened url.
  FIXED: works on urls inside a string

  Example: ::

    echo "foo http://ur1.ca/4110r bar" | pdetinyfy
    foo http://laquadrature.net bar

* purltitle: get a url as input and output the url followed by its title.
  FIXED: works on urls inside a string

  Example: ::

    echo "foo http://laquadrature.net bar" | purltitle
    foo http://laquadrature.net La Quadrature du Net | Internet et Libertés bar

* plag: slow down the display of a stream by sleeping a given time between
  each line. Can take the number of seconds to sleep as an argument (accepts
  float values).

  Example: ::

    dmesg | plag

  or: ::

    dmesg | plag 60

* puniqrt: try to avoid duplicates of similar tweets, for example by removing
  RT tweets of a tweet that has already been displayed. Behaves in the same
  way as puniq.

* premoveurls: remove the urls from a string. This is more an example script
  for the URLPipeTemplate class than something really useful.

  Example: ::

    echo "foo http://laquadrature.net bar" | premoveurls
    "foo bar"

* pcleanurls: clean urls by removing useless information such as the tracking
  "?utm_*" args added to urls.

* ptweetlen: return the length a string will have on Twitter with its urls
  shortened by the t.co domain.

Coding new pipe utils
=====================

The PipeToolKit comes with 2 template Python classes for coding new pipe
utilities for your needs. Here are 2 simple examples of how to use each one. I
don't think you'll need to know anything more. If you want to, just read
the code, it's not long.

PipeTemplate
------------

This is the standard template.

::

    from pipetk import PipeTemplate

    class Example(PipeTemplate):
        # Options: shown here with their default values; you can change one by redefining it
        # FAIL_ON_EXCEPTION = False  # whether the pipe stops when an exception is raised
        # DISPLAY_ERROR = True  # whether the exception backtrace and message are displayed
        # RETRY = False  # whether the pipe must retry its processing on an exception
        # MAX_RETRY = 3  # number of times the pipe must retry processing its input on an exception
        # UNMODIFIED_TO_STDOUT_ON_FAIL = False  # whether, on an exception, the unmodified text must be written
        # WITH_ENDL = True  # whether the endl char must be sent to the process function

        def process(self, line):
            # called every time a line is written on stdin, you must implement it
            # VERY IMPORTANT: process must return something iterable, either a
            # list or by being a generator (by using yield). This allows you to
            # return several different things.

            # do your stuff
            yield line
            # or return [a, b, c, d]

    if __name__ == "__main__":
        Example().run()

URLPipeTemplate
---------------

This is a template to work on every url of a stream. ::

    from pipetk import URLPipeTemplate

    class Example(URLPipeTemplate):
        # Inherits all the options of the PipeTemplate
        # Other option:
        # WITH_EXTRA_SPACE = False  # whether the space that may follow the url in the string is sent to the processing function

        # CAREFUL: this is process_url, not process; you can't implement
        # process since it's already implemented to build this new template.
        def process_url(self, url):
            # called on every url encountered
            # you must return a string
            return ""

    if __name__ == "__main__":
        Example().run()

More examples?
--------------

Just read the code of the existing tools.
Most of them are very simple.

Changelog
=========

0.2
---

* pdetinyfy now works for urls inside a string
* new script: puniqrt to try to eliminate duplicates for tweets
* new template to build pipe utils that work on the urls of a string
* add premoveurls as an example script for the new template
* new script: pcleanurls to remove useless tracking pieces of urls (like utm_* stuff)
* various bug fixes
* add doc on how to write new pipe utils

0.1
---

* Init

Licence
=======

All those tools are released under the `GNU General Public License v3`_ or later.

.. _GNU General Public License v3: http://www.gnu.org/licenses/gpl-3.0.html

Feedback
========

For any feedback you can contact me at <cortex at worlddomination dot be>.

Laurent Peuch
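
As a footnote to the puniq entry under "Current tools": the difference from ``| sort | uniq`` is that each line is emitted as soon as it is first seen, instead of after the whole stream has ended. The sketch below illustrates that idea in plain Python. It is not the puniq source and does not use the pipetk templates; the function name ``unique_lines`` is hypothetical, and keeping every seen line in memory is an assumption made for simplicity.

```python
import sys


def unique_lines(lines):
    """Yield each line the first time it appears, preserving arrival order.

    Works lazily: a line is yielded as soon as it is read, so the
    consumer never has to wait for the end of the input stream.
    """
    seen = set()  # assumption: all distinct lines fit in memory
    for line in lines:
        if line not in seen:
            seen.add(line)
            yield line


# Small in-memory demonstration; a real filter would pass sys.stdin instead
# of a list and flush stdout after each write.
stream = ["a\n", "b\n", "a\n", "c\n", "b\n"]
for line in unique_lines(stream):
    sys.stdout.write(line)
```

Because the function is a generator, it can be fed a stream that never ends (such as the output of feedstail) and still print new lines as they arrive.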