{"id":168,"date":"2016-12-08T21:16:33","date_gmt":"2016-12-08T21:16:33","guid":{"rendered":"http:\/\/python.wp.w3.pt\/?p=168"},"modified":"2016-12-08T21:19:43","modified_gmt":"2016-12-08T21:19:43","slug":"expressoes-regulares-iv","status":"publish","type":"post","link":"http:\/\/python.w3.pt\/?p=168","title":{"rendered":"Regular expressions IV"},"content":{"rendered":"<p>The most laborious regular expression so far has been the one below, which captures mentions, that is, names preceded by @ that are not email addresses.<\/p>\n<p>The hardest part was realizing that the expression <code>(?&lt;=\\@)\\w+<\/code> does not hand the @ back to the preceding <em>negative lookbehind<\/em>, which is why I also had to include the @ in the <em>negative lookbehind<\/em> <code>(?&lt;!\\w\\@)<\/code>. Lookbehinds are zero-width, so both assertions are checked at the same position, right after the @.<\/p>\n<p>Here is the code:<\/p>\n<pre>#!\/usr\/bin\/python\r\n# -*- coding: utf-8 -*-\r\n# Python 2 script (ur'' literal, print statement, iteritems)\r\nfrom __future__ import unicode_literals\r\nimport re\r\nimport collections\r\n\r\nline = \"@mary call @john or send him an email @ john2@gmail.com also tell @mary that I cannot go. Regards @est\u00eav\u00e3o\"\r\n\r\n# Match a word right after an @, unless that @ follows a word character (email address)\r\np = re.compile(ur'(?i)(?&lt;!\\w\\@)((?&lt;=\\@)\\w+)',re.U)\r\nr = p.findall(line)\r\n# Count the occurrences of each mention\r\ncnt = collections.Counter(r)\r\n\r\nprint cnt\r\n\r\nfor key, value in cnt.iteritems():\r\n    print key, value\r\n<\/pre>\n<p>And the result, with occurrence counts:<\/p>\n<pre>Counter({u'mary': 2, u'est\\xeav\\xe3o': 1, u'john': 1})\r\nest\u00eav\u00e3o 1\r\njohn 1\r\nmary 2\r\n<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>The most laborious regular expression so far has been the one below, which captures mentions, that is, names preceded by @ that are not email addresses. 
The hardest part was realizing that the expression (?&lt;=\\@)\\w+ does not hand the @ back to the preceding negative lookbehind, which is why I also had to include the @ &hellip; <\/p>\n<p class=\"link-more\"><a href=\"http:\/\/python.w3.pt\/?p=168\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">&#8220;Regular expressions IV&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"_links":{"self":[{"href":"http:\/\/python.w3.pt\/index.php?rest_route=\/wp\/v2\/posts\/168"}],"collection":[{"href":"http:\/\/python.w3.pt\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/python.w3.pt\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/python.w3.pt\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/python.w3.pt\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=168"}],"version-history":[{"count":4,"href":"http:\/\/python.w3.pt\/index.php?rest_route=\/wp\/v2\/posts\/168\/revisions"}],"predecessor-version":[{"id":172,"href":"http:\/\/python.w3.pt\/index.php?rest_route=\/wp\/v2\/posts\/168\/revisions\/172"}],"wp:attachment":[{"href":"http:\/\/python.w3.pt\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=168"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/python.w3.pt\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=168"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/python.w3.pt\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=168"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}