Remove trailing and leading spaces and extra internal whitespace with one gsub call

I know you can remove trailing and leading spaces with

gsub("^\\s+|\\s+$", "", x)

And you can collapse extra internal whitespace with

gsub("\\s+"," ",x)

I can combine these into one function, but I was wondering whether there is a way to do it with just one call to gsub?

trim <- function(x) {
  x <- gsub("^\\s+|\\s+$", "", x)  # strip leading and trailing whitespace
  gsub("\\s+", " ", x)             # collapse internal runs to a single space
}

testString <- " This is a test. "
trim(testString)

--------------Solutions-------------

Here is an option:

gsub("^ +| +$|( ) +", "\\1", testString) # with Frank's input, and Agstudy's style

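The capturing group grabs the first space of each internal run and the replacement \\1 puts it back, so multiple internal spaces collapse to a single one. For the leading and trailing alternatives the group is empty, so those matches are simply deleted. Change " " to \\s if you also want to remove non-space whitespace such as tabs.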
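To illustrate the \\s variant, here is a minimal sketch. The function name squish and the sample string are made up for this example, and perl = TRUE is my own addition: PCRE tries alternatives from left to right, so the two trimming alternatives get first pick at the edges of the string.

```r
# Hypothetical helper: trim both ends and collapse internal whitespace runs.
squish <- function(x) {
  # (\\s)\\s+ captures the first whitespace character of a run of two or more
  # and "\\1" puts it back. For the first two alternatives the group is unset,
  # so the whole match is replaced by nothing.
  gsub("^\\s+|\\s+$|(\\s)\\s+", "\\1", x, perl = TRUE)
}

messy <- "  \tThis   is \t a    test. \n"  # made-up input with tabs and a newline
squish(messy)
```

Note that the character kept from each run is its first one, so a run that starts with a tab collapses to a tab rather than to a space.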

Using a positive lookbehind:

gsub("^ *|(?<= ) | *$",'',testString,perl=TRUE)
# "This is a test."

Explanation:

## "^ *"     matches any leading spaces
## "(?<= ) " has the general form (?<=a)b:
##           it matches a "b" (a space here)
##           that is preceded by an "a" (another space here)
## " *$"     matches any trailing spaces
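The lookbehind part can be checked on its own. This is just a sketch on a made-up string, using gregexpr to show which spaces the pattern picks out before gsub deletes them.

```r
x <- "a   b"  # made-up input: three spaces between "a" and "b"

# Start positions of every space that directly follows another space
hits <- gregexpr("(?<= ) ", x, perl = TRUE)[[1]]
as.integer(hits)  # the 2nd and 3rd space of the run

# Deleting exactly those spaces leaves the first space of the run in place
collapsed <- gsub("(?<= ) ", "", x, perl = TRUE)
collapsed
```

Each match is a single space, so a run of n spaces produces n - 1 matches and one space survives.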

You can just add \\s+(?=\\s) to your original regex:

gsub("^\\s+|\\s+$|\\s+(?=\\s)", "", x, perl = TRUE)
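One thing worth knowing about the lookahead form: it deletes everything in a whitespace run except the last character, so whatever that character is (space, tab, newline) is what remains. A small sketch on a made-up string:

```r
x <- "\t This \t is\n\na  test. \n"  # made-up input with tabs and newlines

out <- gsub("^\\s+|\\s+$|\\s+(?=\\s)", "", x, perl = TRUE)
out
```

Here the run "\n\n" is reduced to a single "\n" rather than to a space. If every run should become a plain space, the two-step version from the question is still the simpler route.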


You've asked for a gsub option and gotten good options. There's also rm_white_multiple from "qdapRegex":

> testString<- " This is a test. "
> library(qdapRegex)
> rm_white_multiple(testString)
[1] "This is a test."

If an answer not using gsub is acceptable then the following does it. It does not use any regular expressions:

paste(scan(textConnection(testString), what = "", quiet = TRUE), collapse = " ")

giving:

[1] "This is a test."
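One caveat, stated as an assumption about your data: by default scan treats ' and " as quote characters, so text containing an apostrophe may not split the way you expect. A sketch that turns quoting off (the helper name squish_scan is hypothetical):

```r
# Hypothetical helper; it handles a single string, not a character vector.
squish_scan <- function(x) {
  # what = "" reads whitespace-separated character fields;
  # quote = "" stops scan from treating ' and " as quote characters.
  words <- scan(text = x, what = "", quote = "", quiet = TRUE)
  paste(words, collapse = " ")
}

squish_scan("  don't   stop \t now ")
```

The text argument of scan reads from the string directly, so the explicit textConnection is not needed here.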

You can also nest two gsub calls, though this is less elegant than the previous answers:

> gsub("\\s+"," ",gsub("^\\s+|\\s+$","",testString))
[1] "This is a test."
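Since gsub is vectorised, the nested form and a single-call form can be compared on several strings at once. A quick sketch with made-up inputs that contain only plain spaces:

```r
x <- c("  This   is  a test.   ", "clean already", "   ")  # made-up inputs

# Two calls: trim the ends first, then collapse internal runs
two_step <- gsub("\\s+", " ", gsub("^\\s+|\\s+$", "", x))

# One call: the lookahead alternative removes all but the last space of a run
one_step <- gsub("^\\s+|\\s+$|\\s+(?=\\s)", "", x, perl = TRUE)

identical(two_step, one_step)
```

The two agree on these inputs; they can differ when a run ends in a tab or newline, because the one-call form keeps that character while the nested form always writes a space.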

Category:regex Time:2018-11-15 Views:0
Tags: regex
