Regex pattern using \w.* not matching text starting with foreign characters such as Ä

I have the following regex that I have been using successfully:

preg_match_all('/(\d+)\n(\w.*)\n(\d{3}\.\d{3}\.\d{2})\n(\d.*)\n(\d.*)/', $text, $matches)

However I have just found that if the text that the (\w.*) part matches starts with a foreign character such as Ä, then it doesn't match anything.

Can anyone help me with what the correct pattern should be instead of (\w.*) to match a string that starts with any character?

Many thanks

--------------Solutions-------------

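If you want to match umlauts, add the /u modifier to the pattern, or use \pL in place of \w (also with /u, so that the subject is read as UTF-8). Either change lets the pattern match letters outside the ASCII range. Note that \pL matches letters only, whereas \w also matches digits and the underscore.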

Reference: http://www.regular-expressions.info/unicode.html
and http://php.net/manual/en/regexp.reference.unicode.php
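
As a rough illustration of the two options above, the following sketch runs the original pattern with and without /u, and once with \pL. The sample text is an assumption made up for this example (it is not from the question), and the result without /u assumes the default "C" locale.

```php
<?php
// Hypothetical sample input: the second line starts with Ä (U+00C4).
$sampleText = "12\n\u{00C4}pfel und Birnen\n123.456.78\n10 pieces\n2 boxes";

// Original pattern: \w is limited to ASCII word characters here.
$asciiPattern = '/(\d+)\n(\w.*)\n(\d{3}\.\d{3}\.\d{2})\n(\d.*)\n(\d.*)/';
// Same pattern with /u: pattern and subject are treated as UTF-8.
$utf8Pattern = '/(\d+)\n(\w.*)\n(\d{3}\.\d{3}\.\d{2})\n(\d.*)\n(\d.*)/u';
// \pL variant: the second group must start with a Unicode letter.
$letterPattern = '/(\d+)\n(\pL.*)\n(\d{3}\.\d{3}\.\d{2})\n(\d.*)\n(\d.*)/u';

$asciiCount  = preg_match_all($asciiPattern, $sampleText, $asciiMatches);
$utf8Count   = preg_match_all($utf8Pattern, $sampleText, $utf8Matches);
$letterCount = preg_match_all($letterPattern, $sampleText, $letterMatches);

echo "without /u: $asciiCount\n";
echo "with /u:    $utf8Count\n";
echo "with \\pL:   $letterCount\n";
if ($utf8Count > 0) {
    echo "second group: ", $utf8Matches[2][0], "\n";
}
```

If the text can also start with a digit or an underscore, the /u variant keeps the original \w meaning, while the \pL variant would reject those lines.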

Ä is a German umlaut, if I am not mistaken. \w matches (in most flavors) [a-zA-Z0-9_].

You will need to match the Unicode range of characters that you want.

\x{00C4} (PHP) is the character you want. You will probably need to create a character class that covers your Unicode characters, and use the /u modifier so that the subject is read as UTF-8.

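You may have to switch to explicit Unicode ranges.

For ASCII you would use [\x{0021}-\x{007e}]; in this case, maybe [\x{0021}-\x{007e}\x{0192}-\x{0687}]. (PHP's PCRE does not accept the \uXXXX notation; it uses \x{XXXX} together with the /u modifier.) Note that Ä is U+00C4, which lies below \x{0192}, so the second range would have to start lower to include it.

I'm not quite sure which range of characters you want, but I think \w only matches characters in the normal ASCII range.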
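
A minimal sketch of the explicit-range idea, assuming the wanted characters are printable ASCII plus U+00C0 to U+017F (Latin-1 Supplement letters and Latin Extended-A). That range is an assumption chosen for illustration, not something stated in the question, and the sample strings are made up.

```php
<?php
// Hypothetical character class: printable ASCII plus U+00C0..U+017F.
$firstCharClass = '[\x{0021}-\x{007E}\x{00C0}-\x{017F}]';
// /u is needed so that \x{...} refers to code points in a UTF-8 subject.
$rangePattern = '/^' . $firstCharClass . '.*$/u';

$samples = [
    "\u{00C4}rger",    // starts with Ä (U+00C4), inside the range
    "Zebra",           // starts with an ASCII letter
    " leading space",  // starts with U+0020, outside the range
];

$rangeResults = [];
foreach ($samples as $sample) {
    // preg_match returns 1 on a match and 0 otherwise.
    $rangeResults[] = preg_match($rangePattern, $sample);
}

print_r($rangeResults);
```

Hand-written ranges are easy to get wrong, so a property such as \pL is usually simpler when "any letter" is what is meant.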

Consider using:

/(\d+)\n([\p{L}\p{N}_].*)\n(\d{3}\.\d{3}\.\d{2})\n(\d.*)\n(\d.*)/u
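
One detail worth checking with a pattern like this is the group numbering: an alternation wrapped in its own parentheses adds a capture group, whereas a character class such as [\p{L}\p{N}_] does not. The sketch below compares both forms on a made-up sample text (an assumption for illustration only).

```php
<?php
// Hypothetical sample input: the second line starts with Ä (U+00C4).
$groupText = "12\n\u{00C4}pfel und Birnen\n123.456.78\n10 pieces\n2 boxes";

// Alternation in parentheses: introduces an extra capture group (group 3).
$withAlternation = '/(\d+)\n((\p{L}|\p{N}|_).*)\n(\d{3}\.\d{3}\.\d{2})\n(\d.*)\n(\d.*)/u';
// Character class: same first-character test, no extra group.
$withClass = '/(\d+)\n([\p{L}\p{N}_].*)\n(\d{3}\.\d{3}\.\d{2})\n(\d.*)\n(\d.*)/u';

preg_match($withAlternation, $groupText, $alternationGroups);
preg_match($withClass, $groupText, $classGroups);

// Index 0 holds the whole match, so subtract one to count capture groups.
echo "groups with alternation: ", count($alternationGroups) - 1, "\n";
echo "groups with class:       ", count($classGroups) - 1, "\n";
echo "group 3 with alternation: ", $alternationGroups[3], "\n";
echo "group 3 with class:       ", $classGroups[3], "\n";
```

Code that reads $matches[3] and later indexes from the original pattern keeps working with the character-class form, since the numbering is unchanged.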

Category:php Time:2011-11-15 Views:0
