{"id":150,"date":"2016-12-07T17:23:07","date_gmt":"2016-12-07T17:23:07","guid":{"rendered":"http:\/\/python.wp.w3.pt\/?p=150"},"modified":"2016-12-08T00:46:47","modified_gmt":"2016-12-08T00:46:47","slug":"expressoes-regulares","status":"publish","type":"post","link":"http:\/\/python.w3.pt\/?p=150","title":{"rendered":"Regular expressions"},"content":{"rendered":"<p>Beyond the regular expressions I used with FLEX and, later, with Perl and PHP, Python has some newer constructs that seem to me to have been created by Python itself, and that are quite interesting.<\/p>\n<p><code class=\"docutils literal\"><span class=\"pre\">*?<\/span><\/code>, <code class=\"docutils literal\"><span class=\"pre\">+?<\/span><\/code>, <code class=\"docutils literal\"><span class=\"pre\">??<\/span><\/code><br \/>\nCreates a non-greedy version of the quantifiers <code class=\"docutils literal\"><span class=\"pre\">*<\/span><\/code>, <code class=\"docutils literal\"><span class=\"pre\">+<\/span><\/code>, <code class=\"docutils literal\"><span class=\"pre\">?<\/span><\/code>. In this case, the shortest possible match is found rather than the longest.<\/p>\n<p><code class=\"docutils literal\"><span class=\"pre\">{m,n}?<\/span><\/code><br \/>\nAs in the previous case, this finds the shortest match allowed by <code class=\"docutils literal\"><span class=\"pre\">{m,n}<\/span><\/code>. 
For example, for the 6-character string <code class=\"docutils literal\"><span class=\"pre\">'aaaaaa'<\/span><\/code>, <code class=\"docutils literal\"><span class=\"pre\">a{3,5}<\/span><\/code> matches 5 <code class=\"docutils literal\"><span class=\"pre\">'a'<\/span><\/code> characters, whereas <code class=\"docutils literal\"><span class=\"pre\">a{3,5}?<\/span><\/code> matches only 3.<\/p>\n<p><code class=\"docutils literal\"><span class=\"pre\">(?...)<\/span><\/code><br \/>\nThis is an extension notation. The character after the <code class=\"docutils literal\"><span class=\"pre\">?<\/span><\/code> determines the rest of the syntax. See below.<\/p>\n<p><code class=\"docutils literal\"><span class=\"pre\">(?iLmsux)<\/span><\/code><br \/>\nThe letters <code class=\"docutils literal\"><span class=\"pre\">'i'<\/span><\/code>, <code class=\"docutils literal\"><span class=\"pre\">'L'<\/span><\/code>, <code class=\"docutils literal\"><span class=\"pre\">'m'<\/span><\/code>, <code class=\"docutils literal\"><span class=\"pre\">'s'<\/span><\/code>, <code class=\"docutils literal\"><span class=\"pre\">'u'<\/span><\/code>, <code class=\"docutils literal\"><span class=\"pre\">'x'<\/span><\/code> correspond to the flags <a class=\"reference internal\" title=\"re.I\" href=\"https:\/\/docs.python.org\/2\/library\/re.html#re.I\"><code class=\"xref py py-const docutils literal\"><span class=\"pre\">re.I<\/span><\/code><\/a> (ignore case), <a class=\"reference internal\" title=\"re.L\" href=\"https:\/\/docs.python.org\/2\/library\/re.html#re.L\"><code class=\"xref py py-const docutils literal\"><span class=\"pre\">re.L<\/span><\/code><\/a> (locale dependent), <a class=\"reference internal\" title=\"re.M\" href=\"https:\/\/docs.python.org\/2\/library\/re.html#re.M\"><code class=\"xref py py-const docutils literal\"><span class=\"pre\">re.M<\/span><\/code><\/a> (multi-line), <a class=\"reference internal\" title=\"re.S\" 
href=\"https:\/\/docs.python.org\/2\/library\/re.html#re.S\"><code class=\"xref py py-const docutils literal\"><span class=\"pre\">re.S<\/span><\/code><\/a> (dot matches all), <a class=\"reference internal\" title=\"re.U\" href=\"https:\/\/docs.python.org\/2\/library\/re.html#re.U\"><code class=\"xref py py-const docutils literal\"><span class=\"pre\">re.U<\/span><\/code><\/a> (Unicode dependent), and <a class=\"reference internal\" title=\"re.X\" href=\"https:\/\/docs.python.org\/2\/library\/re.html#re.X\"><code class=\"xref py py-const docutils literal\"><span class=\"pre\">re.X<\/span><\/code><\/a> (verbose), applied to the entire regular expression. Alternatively, the flags can be passed to the <a class=\"reference internal\" title=\"re.compile\" href=\"https:\/\/docs.python.org\/2\/library\/re.html#re.compile\"><code class=\"xref py py-func docutils literal\"><span class=\"pre\">re.compile()<\/span><\/code><\/a> function.<\/p>\n<p><code class=\"docutils literal\"><span class=\"pre\">(?:...)<\/span><\/code><\/p>\n<p>Groups an expression without capturing it: the matched text is not added to the list of captured groups.<br \/>\n<code class=\"docutils literal\"><span class=\"pre\">(?P&lt;name&gt;...)<\/span><\/code><\/p>\n<p>Works like a regular group, but the captured text is also accessible through the symbolic name <em>name<\/em>. Within a single expression, each <em>name<\/em> must be unique. For example, <code class=\"docutils literal\"><span class=\"pre\">(?P&lt;quote&gt;['\"]).*?(?P=quote)<\/span><\/code> matches any string delimited by single or double quotes.<\/p>\n<p><code class=\"docutils literal\"><span class=\"pre\">(?P=name)<\/span><\/code><br \/>\nA backreference to a named group: it matches whatever text was matched by the earlier group called <em>name<\/em>.<\/p>\n<p><code class=\"docutils literal\"><span class=\"pre\">(?#...)<\/span><\/code><br \/>\nA comment. 
Its contents are ignored.<\/p>\n<p><code class=\"docutils literal\"><span class=\"pre\">(?=...)<\/span><\/code><br \/>\nMatches if <code class=\"docutils literal\"><span class=\"pre\">...<\/span><\/code> matches next, but does not consume any of the string. This is called a <em>lookahead assertion<\/em>. For example, <code class=\"docutils literal\"><span class=\"pre\">Isaac<\/span> <span class=\"pre\">(?=Asimov)<\/span><\/code> matches <code class=\"docutils literal\"><span class=\"pre\">'Isaac<\/span> <span class=\"pre\">'<\/span><\/code> only if it is followed by <code class=\"docutils literal\"><span class=\"pre\">'Asimov'<\/span><\/code>.<\/p>\n<p><code class=\"docutils literal\"><span class=\"pre\">(?!...)<\/span><\/code><br \/>\nMatches if <code class=\"docutils literal\"><span class=\"pre\">...<\/span><\/code> does not match next. This is a <em>negative lookahead assertion<\/em>. For example, <code class=\"docutils literal\"><span class=\"pre\">Isaac<\/span> <span class=\"pre\">(?!Asimov)<\/span><\/code> matches <code class=\"docutils literal\"><span class=\"pre\">'Isaac<\/span> <span class=\"pre\">'<\/span><\/code> only if it is not followed by <code class=\"docutils literal\"><span class=\"pre\">'Asimov'<\/span><\/code>.<\/p>\n<p><code class=\"docutils literal\"><span class=\"pre\">(?&lt;=...)<\/span><\/code><br \/>\nMatches if the current position in the string is preceded by a match for <code class=\"docutils literal\"><span class=\"pre\">...<\/span><\/code> that ends at the current position. This is called a <em class=\"dfn\">positive lookbehind assertion<\/em>. 
<code class=\"docutils literal\"><span class=\"pre\">(?&lt;=abc)def<\/span><\/code> finds a match in <code class=\"docutils literal\"><span class=\"pre\">abcdef<\/span><\/code>, since the <em>lookbehind<\/em> steps back 3 characters and checks whether the contained pattern matches.<\/p>\n<pre>&gt;&gt;&gt; import re\r\n&gt;&gt;&gt; m = re.search('(?&lt;=abc)def', 'abcdef')\r\n&gt;&gt;&gt; m.group(0)\r\n'def'\r\n<\/pre>\n<p>Another example, which looks for a word that follows a hyphen:<\/p>\n<pre>&gt;&gt;&gt; m = re.search(r'(?&lt;=-)\\w+', 'spam-egg')\r\n&gt;&gt;&gt; m.group(0)\r\n'egg'\r\n<\/pre>\n<p><code class=\"docutils literal\"><span class=\"pre\">(?&lt;!...)<\/span><\/code><\/p>\n<p>Matches if the current position in the string is not preceded by a match for <code class=\"docutils literal\"><span class=\"pre\">...<\/span><\/code>. This is called a <em class=\"dfn\">negative lookbehind assertion<\/em>. It is the opposite of the previous one.<\/p>\n<p><code class=\"docutils literal\"><span class=\"pre\">(?(id\/name)yes-pattern|no-pattern)<\/span><\/code><\/p>\n<p class=\"first\">Tries to match with <code class=\"docutils literal\"><span class=\"pre\">yes-pattern<\/span><\/code> if the group with the given <em>id<\/em> or <em>name<\/em> exists (that is, took part in the match), and with <code class=\"docutils literal\"><span class=\"pre\">no-pattern<\/span><\/code> if it does not. <code class=\"docutils literal\"><span class=\"pre\">no-pattern<\/span><\/code> is optional and can be omitted. 
For example, <code class=\"docutils literal\"><span class=\"pre\">(&lt;)?(\\w+@\\w+(?:\\.\\w+)+)(?(1)&gt;)<\/span><\/code> is a poor email-matching pattern, which matches <code class=\"docutils literal\"><span class=\"pre\">'&lt;user@host.com&gt;'<\/span><\/code> as well as <code class=\"docutils literal\"><span class=\"pre\">'user@host.com'<\/span><\/code>, but not <code class=\"docutils literal\"><span class=\"pre\">'&lt;user@host.com'<\/span><\/code>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Beyond the regular expressions I used with FLEX and, later, with Perl and PHP, Python has some newer constructs that seem to me to have been created by Python itself, and that are quite interesting. *?, +?, ?? Creates a non-greedy version of the quantifiers *, +, ?. In this case, &hellip; <\/p>\n<p class=\"link-more\"><a href=\"http:\/\/python.w3.pt\/?p=150\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">&#8220;Regular 
expressions&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"_links":{"self":[{"href":"http:\/\/python.w3.pt\/index.php?rest_route=\/wp\/v2\/posts\/150"}],"collection":[{"href":"http:\/\/python.w3.pt\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/python.w3.pt\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/python.w3.pt\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/python.w3.pt\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=150"}],"version-history":[{"count":6,"href":"http:\/\/python.w3.pt\/index.php?rest_route=\/wp\/v2\/posts\/150\/revisions"}],"predecessor-version":[{"id":156,"href":"http:\/\/python.w3.pt\/index.php?rest_route=\/wp\/v2\/posts\/150\/revisions\/156"}],"wp:attachment":[{"href":"http:\/\/python.w3.pt\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=150"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/python.w3.pt\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=150"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/python.w3.pt\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=150"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}