Separate title string with no spaces into words












31















I want to find and separate words in a title that has no spaces.



Before:




ThisIsAnExampleTitleHELLO-WORLD2019T.E.S.T.(Test)[Test]"Test"'Test'




After:




This Is An Example Title HELLO-WORLD 2019 T.E.S.T. (Test) [Test] "Test" 'Test'






I'm looking for a regular expression rule that can do the following.



I thought I'd identify each word if it starts with an uppercase letter.



But it should also preserve all-uppercase words, so they are not spaced out into A L L U P P E R C A S E.



Additional rules:




  • Space a letter if it touches a number: Hello2019World → Hello 2019 World

  • Do not space initials that contain periods, hyphens, or underscores: T.E.S.T.

  • Do not space text between brackets, parentheses, or quotes: [Test] (Test) "Test" 'Test'

  • Preserve hyphens: Hello-World




C#



https://rextester.com/GAZJS38767



using System;
using System.Linq;
using System.Text.RegularExpressions;

// Title without spaces
string title = "ThisIsAnExampleTitleHELLO-WORLD2019T.E.S.T.(Test)[Test]\"Test\"'Test'";

// Detect where to space words
string[] split = Regex.Split(title, @"(?<!^)(?=(?<![.\-'""([{])[A-Z][\d+]?)");

// Trim each word of extra spaces before joining
split = (from e in split
         select e.Trim()).ToArray();

// Join into new title
string newtitle = string.Join(" ", split);

// Display
Console.WriteLine(newtitle);




Regular expression



I'm having trouble with spacing before the numbers, brackets, parentheses, and quotes.



https://regex101.com/r/9IIYGX/1



(?<!^)(?=(?<![.\-'"([{])(?<![A-Z])[A-Z][\d+]?)

(?<!^) // Negative lookbehind: not at the start of the string

(?= // Positive lookahead

(?<![.\-'"([{]) // Ignore if preceded by punctuation
(?<![A-Z]) // Ignore if preceded by an uppercase letter
[A-Z] // An uppercase letter: the space goes before it
[\d+]? // Optionally followed by a digit (inside [...] the + is a literal plus sign)

)




Solution



Thanks for all your combined effort in the answers. Here's a Regex example. I'm applying this to file names and have excluded the special characters /:*?"<>|.



https://rextester.com/FYEVE73725



https://regex101.com/r/xi8L4z/1






























  • 10





    I am up-voting because it's the first post I have seen in hours that has an appropriate amount of information, research and effort

    – Michael Randall
    Mar 11 at 6:02








  • 2





    @MichaelRandall And sadly, that is a better track record than what I see coming on the site during most weekend days.

    – Tim Biegeleisen
    Mar 11 at 6:04


















c# regex














edited Mar 12 at 1:57







Matt McManis

















asked Mar 11 at 5:55









Matt McManis




















4 Answers


















8














The first parts are similar to @revo's answer: (?<!^|[A-Z\p{P}])[A-Z]|(?<=\p{P})\p{P}. In addition, I added the following regex to put a space between a number and a letter: (?<=[a-z])(?=\d)|(?<=\d)(?=[a-z])|(?<=[A-Z])(?=\d)|(?<=\d)(?=[A-Z]). To handle input such as OTPIsADevice, I also added a lookahead and a lookbehind that find an uppercase letter next to a lowercase one: (((?<!^)[A-Z](?=[a-z]))|((?<=[a-z])[A-Z]))



Note that | is the alternation (OR) operator, which lets all of these patterns be tried within a single regex.



Regex: (?<!^|[A-Z\p{P}])[A-Z]|(?<=\p{P})\p{P}|(?<=[a-z])(?=\d)|(?<=\d)(?=[a-z])|(?<=[A-Z])(?=\d)|(?<=\d)(?=[A-Z])|(((?<!^)[A-Z](?=[a-z]))|((?<=[a-z])[A-Z]))



Demo



Update



Improved a bit:



From: (?<!^|[A-Z\p{P}])[A-Z]|(?<=\p{P})\p{P}|(?<=[a-z])(?=\d)|(?<=\d)(?=[a-z])|(?<=[A-Z])(?=\d)|(?<=\d)(?=[A-Z])



into: (?<!^|[A-Z\p{P}])[A-Z]|(?<=\p{P})\p{P}|(?<=\p{L})\d which does the same thing.



(((?<!^)(?<!\p{P})[A-Z](?=[a-z]))|((?<=[a-z])[A-Z]))|(?<!^)(?=[[({&])|(?<=[)\]}!&}]) is adapted from the OP's comment, which adds an exception for some punctuation: (((?<!^)(?<!['([{])[A-Z](?=[a-z]))|((?<=[a-z])[A-Z]))|(?<!^)(?=[[({&])|(?<=[)\]}!&}])



Final regex:
(?<!^|[A-Z\p{P}])[A-Z]|(?<=\p{P})\p{P}|(?<=\p{L})\d|(((?<!^)(?<!\p{P})[A-Z](?=[a-z]))|((?<=[a-z])[A-Z]))|(?<!^)(?=[[({&])|(?<=[)\]}!&}])



Demo
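
As a rough sketch of how the final regex might be applied in C#, assuming the pattern is meant to use \p{P}, \p{L} and \d, and assuming the same " $0" replacement string that @revo's answer uses. The names TitleSpacer and AddSpaces are made up for illustration and do not come from the thread:

```csharp
using System;
using System.Text.RegularExpressions;

public static class TitleSpacer
{
    // Final pattern from the answer, with the escapes written out.
    // The opening bracket inside the character class is escaped here
    // purely for readability; it matches the same characters.
    private const string Pattern =
        @"(?<!^|[A-Z\p{P}])[A-Z]|(?<=\p{P})\p{P}|(?<=\p{L})\d" +
        @"|(((?<!^)(?<!\p{P})[A-Z](?=[a-z]))|((?<=[a-z])[A-Z]))" +
        @"|(?<!^)(?=[\[({&])|(?<=[)\]}!&}])";

    // Puts a space in front of every match. For the zero-width
    // alternatives $0 is empty, so only the space is inserted.
    public static string AddSpaces(string title)
    {
        return Regex.Replace(title, Pattern, " $0");
    }
}
```

Calling TitleSpacer.AddSpaces on the sample title only ever inserts spaces, so removing the spaces from the result should give back the original string.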
































  • This is almost working perfectly. One issue: somewhere in the last part, |(((?<!^)[A-Z](?=[a-z]))|((?<=[a-z])[A-Z])) is not preserving the parentheses, brackets, and quotes. rextester.com/BTA83734

    – Matt McManis
    Mar 11 at 20:48











  • Thanks, your regex has solved the single letter problem. I've added some extra rules at the end to handle the other issues. rextester.com/FYEVE73725

    – Matt McManis
    Mar 12 at 1:53



















18














Here is a regex which seems to work well, at least for your sample input:



(?<=[a-z])(?=[A-Z])|(?<=[0-9])(?=[A-Za-z])|(?<=[A-Za-z])(?=[0-9])|(?<=\W)(?=\W)


This pattern says to split at any boundary that meets one of the following conditions:




  • what precedes is a lowercase letter and what follows is an uppercase letter

  • what precedes is a digit and what follows is a letter (or
    vice-versa)

  • what precedes and what follows are both non-word characters
    (e.g. quote, parenthesis, etc.)





// requires: using System.Linq; using System.Text.RegularExpressions;
string title = "ThisIsAnExampleTitleHELLO-WORLD2019T.E.S.T.(Test)[Test]\"Test\"'Test'";
string[] split = Regex.Split(title, @"(?<=[a-z])(?=[A-Z])|(?<=[0-9])(?=[A-Za-z])|(?<=[A-Za-z])(?=[0-9])|(?<=\W)(?=\W)");
split = (from e in split select e.Trim()).ToArray();
string newtitle = string.Join(" ", split);

This Is An Example Title HELLO-WORLD 2019 T.E.S.T. (Test) [Test] "Test" 'Test'


Note: You might also want to add this assertion to the regex alternation:



(?<=\W)(?=\w)|(?<=\w)(?=\W)


We got away without it here because the sample input never needed a split at this kind of boundary, but you might need it with other inputs.
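
To see what the extra assertion changes, here is a small hedged sketch. BoundaryDemo and Space are hypothetical names, and the inputs are made up. It suggests a trade-off: the extra alternatives separate a word from an adjacent bracket, but they also split around every hyphen, which works against the "Preserve hyphens" rule in the question.

```csharp
using System.Text.RegularExpressions;

public static class BoundaryDemo
{
    // The pattern from the answer.
    private const string Original =
        @"(?<=[a-z])(?=[A-Z])|(?<=[0-9])(?=[A-Za-z])|(?<=[A-Za-z])(?=[0-9])|(?<=\W)(?=\W)";

    // The optional word/non-word boundary assertion from the note.
    private const string Extra = @"|(?<=\W)(?=\w)|(?<=\w)(?=\W)";

    // Splits the title on the chosen pattern and joins the pieces with spaces.
    public static string Space(string title, bool withExtra)
    {
        string pattern = withExtra ? Original + Extra : Original;
        return string.Join(" ", Regex.Split(title, pattern));
    }
}

// BoundaryDemo.Space("Hello-World", false)  returns "Hello-World"
// BoundaryDemo.Space("Hello-World", true)   returns "Hello - World"
// BoundaryDemo.Space("Hello(World)", false) returns "Hello(World)"
// BoundaryDemo.Space("Hello(World)", true)  returns "Hello ( World )"
```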
































  • I ran into one issue, when it comes to single letter words like A and I, it will not separate because it uses the ALL UPPERCASE rule (two uppercase next to each other). ATitleExample becomes ATitle Example.

    – Matt McManis
    Mar 11 at 7:35






  • 1





    @MattMcManis This is an edge case which will potentially break all of the answers given here. You would need to do more work to cover such cases.

    – Tim Biegeleisen
    Mar 11 at 7:36











  • Maybe I can run the output of this through a second regex to fix those.

    – Matt McManis
    Mar 11 at 7:38



















9














Aiming for simplicity rather than a huge regex, I would recommend this code with small, simple patterns (comments with explanations are in the code):



string str = "ThisIsAnExampleTitleHELLO-WORLD2019T.E.S.T.(Test)\"Test\"'Test'[Test]";
// insert a space when a lowercase letter is followed by an uppercase letter
str = Regex.Replace(str, "(?<=[a-z])(?=[A-Z])", " ");
// insert a space whenever a digit is followed by a letter
str = Regex.Replace(str, @"(?<=\d)(?=[A-Za-z])", " ");
// insert a space when a letter is followed by a digit
str = Regex.Replace(str, @"(?<=[A-Za-z])(?=\d)", " ");
// insert a space when one of the characters ("'[ is followed by a letter or digit
str = Regex.Replace(str, @"(?=[(\[""'][a-zA-Z0-9])", " ");
// insert a space when what precedes is one of the characters ])"'
str = Regex.Replace(str, @"(?<=[)\]""'])", " ");





























  • If commenting was your main concern you could enable x-mode or use inline comments i.e. (?#insert space when there's letter followed by digit).

    – revo
    Mar 11 at 7:45






  • 2





    @revo I used standard C# comments :) I think it's more readable.

    – Michał Turczyn
    Mar 11 at 7:46






  • 2





    You could also write that kind of readable comment by setting the standard x modifier, which lets you write multiline, indented comments. It's not simple, by the way. Just split.

    – revo
    Mar 11 at 7:49
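
A minimal sketch of what @revo's suggestion could look like, using RegexOptions.IgnorePatternWhitespace (the .NET name for x-mode) to fold the first three single-purpose patterns from the answer into one commented regex. It assumes those patterns are meant to use \d. The class and method names are hypothetical, and the bracket and quote rules are left out for brevity:

```csharp
using System.Text.RegularExpressions;

public static class CommentedSplitter
{
    // In x-mode, unescaped whitespace in the pattern is ignored and
    // everything from # to the end of the line is a comment.
    private static readonly Regex Boundary = new Regex(@"
          (?<=[a-z])(?=[A-Z])      # lowercase letter followed by an uppercase letter
        | (?<=\d)(?=[A-Za-z])      # digit followed by a letter
        | (?<=[A-Za-z])(?=\d)      # letter followed by a digit
        ", RegexOptions.IgnorePatternWhitespace);

    // Inserts a single space at every zero-width boundary the pattern finds.
    public static string AddSpaces(string input)
    {
        return Boundary.Replace(input, " ");
    }
}

// CommentedSplitter.AddSpaces("Hello2019World") returns "Hello 2019 World"
```

Whether this reads better than three separate Regex.Replace calls with C# comments is a matter of taste; the single regex scans the input once, while the separate calls keep each rule independently editable.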





















7














You could reduce the requirements to shorten the regular expression by interpreting them differently. For example, the first requirement is the same as saying: put a space before a capital letter if it is not preceded by a punctuation mark or another capital letter.



The following regex works for almost all of the mentioned requirements and may be extended to include or exclude other situations:



(?<!^|[A-Z\p{P}])[A-Z]|(?<=\p{P})\p{P}


You have to use the Replace() method with " $0" (a space followed by $0) as the substitution string.



See live demo here



.NET (See it in action):



string input = @"ThisIsAnExample.TitleHELLO-WORLD2019T.E.S.T.(Test)""Test""'Test'[Test]";
Regex regex = new Regex(@"(?<!^|[A-Z\p{P}])[A-Z]|(?<=\p{P})\p{P}", RegexOptions.Multiline);
Console.WriteLine(regex.Replace(input, @" $0"));































  • This is an interesting way. Which rule can be added to fix HELLO-WORLD2019 by spacing the 2019?

    – Matt McManis
    Mar 11 at 7:10






  • 1





    Add (?<=\p{L})\d within an alternation: (?<!^|[A-Z\p{P}])[A-Z]|(?<=\p{P})\p{P}|(?<=\p{L})\d.

    – revo
    Mar 11 at 7:17











  • I have one other issue, single letter words like A and I won't space. ATitleExample becomes ATitle Example.

    – Matt McManis
    Mar 11 at 8:23











  • What about something like OTPIsADevice?

    – revo
    Mar 11 at 8:29











  • It starts to get complicated: that gives OTPIs ADevice. Maybe I can run the output through a second filter. Rules: if a word starts with two uppercase letters (ADevice), add a space after the first letter (A Device). And if an all-uppercase word ends in a lowercase letter (OTPIs), add a space before the last two letters (OTP Is).

    – Matt McManis
    Mar 11 at 8:54
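
The two second-pass rules described in the last comment can arguably be covered by a single lookaround: put a space between two uppercase letters when the second one is followed by a lowercase letter. This is only a sketch under that assumption. SecondPass and SplitUpperRuns are hypothetical names, not something from the thread:

```csharp
using System.Text.RegularExpressions;

public static class SecondPass
{
    // Matches the empty position between two uppercase letters where the
    // second one starts a capitalized word (it is followed by a lowercase
    // letter), and inserts a space there.
    public static string SplitUpperRuns(string input)
    {
        return Regex.Replace(input, @"(?<=[A-Z])(?=[A-Z][a-z])", " ");
    }
}

// SecondPass.SplitUpperRuns("OTPIs ADevice")  returns "OTP Is A Device"
// SecondPass.SplitUpperRuns("ATitle Example") returns "A Title Example"
// SecondPass.SplitUpperRuns("HELLO-WORLD")    returns "HELLO-WORLD"
```

It would also split a plural acronym such as PCs into P Cs, so it is not a complete fix.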














answered Mar 11 at 10:26 by Mukyuu (the answer scored 8 above); edited Mar 13 at 10:31













  • This is almost working perfect. One issue, somewhere in the last part |(((?<!^)[A-Z](?=[a-z]))|((?<=[a-z])[A-Z])) is not preserving the parentheses, brackets, and quotes. rextester.com/BTA83734

    – Matt McManis
    Mar 11 at 20:48











  • Thanks, your regex has solved the single letter problem. I've added some extra rules at the end to handle the other issues. rextester.com/FYEVE73725

    – Matt McManis
    Mar 12 at 1:53



















  • This is almost working perfect. One issue, somewhere in the last part |(((?<!^)[A-Z](?=[a-z]))|((?<=[a-z])[A-Z])) is not preserving the parentheses, brackets, and quotes. rextester.com/BTA83734

    – Matt McManis
    Mar 11 at 20:48











  • Thanks, your regex has solved the single letter problem. I've added some extra rules at the end to handle the other issues. rextester.com/FYEVE73725

    – Matt McManis
    Mar 12 at 1:53

















This is almost working perfect. One issue, somewhere in the last part |(((?<!^)[A-Z](?=[a-z]))|((?<=[a-z])[A-Z])) is not preserving the parentheses, brackets, and quotes. rextester.com/BTA83734

– Matt McManis
Mar 11 at 20:48





This is almost working perfect. One issue, somewhere in the last part |(((?<!^)[A-Z](?=[a-z]))|((?<=[a-z])[A-Z])) is not preserving the parentheses, brackets, and quotes. rextester.com/BTA83734

– Matt McManis
Mar 11 at 20:48













Thanks, your regex has solved the single letter problem. I've added some extra rules at the end to handle the other issues. rextester.com/FYEVE73725

– Matt McManis
Mar 12 at 1:53





Thanks, your regex has solved the single letter problem. I've added some extra rules at the end to handle the other issues. rextester.com/FYEVE73725

– Matt McManis
Mar 12 at 1:53













18














Here is a regex which seems to work well, at least for your sample input:



(?<=[a-z])(?=[A-Z])|(?<=[0-9])(?=[A-Za-z])|(?<=[A-Za-z])(?=[0-9])|(?<=W)(?=W)


This patten says to make a split on a boundary of one of the following conditions:




  • what precedes is a lowercase, and what precedes is an uppercase (or
    vice-versa)

  • what precedes is a digit and what follows is a letter (or
    vice-versa)

  • what precedes and what follows is a non word character
    (e.g. quote, parenthesis, etc.)





string title = "ThisIsAnExampleTitleHELLO-WORLD2019T.E.S.T.(Test)[Test]"Test"'Test'";
string split = Regex.Split(title, "(?<=[a-z])(?=[A-Z])|(?<=[0-9])(?=[A-Za-z])|(?<=[A-Za-z])(?=[0-9])|(?<=\W)(?=\W)");
split = (from e in split select e.Trim()).ToArray();
string newtitle = string.Join(" ", split);

This Is An Example Title HELLO-WORLD 2019 T.E.S.T. (Test) [Test] "Test" 'Test'


Note: You might also want to add this assertion to the regex alternation:



(?<=W)(?=w)|(?<=w)(?=W)


We got away with this here, because this boundary condition never happened. But you might need it with other inputs.






share|improve this answer


























  • I ran into one issue, when it comes to single letter words like A and I, it will not separate because it uses the ALL UPPERCASE rule (two uppercase next to each other). ATitleExample becomes ATitle Example.

    – Matt McManis
    Mar 11 at 7:35






  • 1





    @MattMcManis This is an edge case which will potentially break all of the answers given here. You would need to do more work to cover such cses.a

    – Tim Biegeleisen
    Mar 11 at 7:36











  • Maybe I can run the output of this through a second regex to fix those.

    – Matt McManis
    Mar 11 at 7:38
















18














Here is a regex which seems to work well, at least for your sample input:



(?<=[a-z])(?=[A-Z])|(?<=[0-9])(?=[A-Za-z])|(?<=[A-Za-z])(?=[0-9])|(?<=W)(?=W)


This patten says to make a split on a boundary of one of the following conditions:




  • what precedes is a lowercase, and what precedes is an uppercase (or
    vice-versa)

  • what precedes is a digit and what follows is a letter (or
    vice-versa)

  • what precedes and what follows is a non word character
    (e.g. quote, parenthesis, etc.)





string title = "ThisIsAnExampleTitleHELLO-WORLD2019T.E.S.T.(Test)[Test]"Test"'Test'";
string split = Regex.Split(title, "(?<=[a-z])(?=[A-Z])|(?<=[0-9])(?=[A-Za-z])|(?<=[A-Za-z])(?=[0-9])|(?<=\W)(?=\W)");
split = (from e in split select e.Trim()).ToArray();
string newtitle = string.Join(" ", split);

This Is An Example Title HELLO-WORLD 2019 T.E.S.T. (Test) [Test] "Test" 'Test'


Note: You might also want to add this assertion to the regex alternation:



(?<=W)(?=w)|(?<=w)(?=W)


We got away with this here, because this boundary condition never happened. But you might need it with other inputs.






share|improve this answer


























  • I ran into one issue, when it comes to single letter words like A and I, it will not separate because it uses the ALL UPPERCASE rule (two uppercase next to each other). ATitleExample becomes ATitle Example.

    – Matt McManis
    Mar 11 at 7:35






  • 1





    @MattMcManis This is an edge case which will potentially break all of the answers given here. You would need to do more work to cover such cses.a

    – Tim Biegeleisen
    Mar 11 at 7:36











  • Maybe I can run the output of this through a second regex to fix those.

    – Matt McManis
    Mar 11 at 7:38














18












18








18







Here is a regex which seems to work well, at least for your sample input:



(?<=[a-z])(?=[A-Z])|(?<=[0-9])(?=[A-Za-z])|(?<=[A-Za-z])(?=[0-9])|(?<=W)(?=W)


This patten says to make a split on a boundary of one of the following conditions:




  • what precedes is a lowercase, and what precedes is an uppercase (or
    vice-versa)

  • what precedes is a digit and what follows is a letter (or
    vice-versa)

  • what precedes and what follows is a non word character
    (e.g. quote, parenthesis, etc.)





// requires: using System.Linq; using System.Text.RegularExpressions;
string title = "ThisIsAnExampleTitleHELLO-WORLD2019T.E.S.T.(Test)[Test]\"Test\"'Test'";
// Regex.Split returns an array; the verbatim string keeps \W intact
string[] split = Regex.Split(title, @"(?<=[a-z])(?=[A-Z])|(?<=[0-9])(?=[A-Za-z])|(?<=[A-Za-z])(?=[0-9])|(?<=\W)(?=\W)");
split = (from e in split select e.Trim()).ToArray();
string newtitle = string.Join(" ", split);

This Is An Example Title HELLO-WORLD 2019 T.E.S.T. (Test) [Test] "Test" 'Test'


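Note: You might also want to add this assertion to the regex alternation:



(?<=\W)(?=\w)|(?<=\w)(?=\W)


We got away without it here, because the sample input never places a word character directly against a bracket or quote that needs a space. You might need it with other inputs, but be aware that it matches every word boundary inside the string, so it would also split around hyphens and periods, as in HELLO-WORLD and T.E.S.T.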
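
Because the extra assertion in the note matches at every word boundary, one possible refinement is to limit it to brackets. The sketch below is only an illustration under that assumption: the `TitleSplitter` class, the `Split` method, and the two bracket-only alternatives are hypothetical additions, not part of the answer. Quotes are left out on purpose, since an opening quote cannot be told apart from a closing one with a single lookaround.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TitleSplitter
{
    // The answer's alternation plus two hypothetical bracket-only alternatives.
    private static readonly Regex Boundary = new Regex(
        @"(?<=[a-z])(?=[A-Z])" +        // lowercase letter, then uppercase letter
        @"|(?<=[0-9])(?=[A-Za-z])" +    // digit, then letter
        @"|(?<=[A-Za-z])(?=[0-9])" +    // letter, then digit
        @"|(?<=\W)(?=\W)" +             // two adjacent non-word characters
        @"|(?<=\w)(?=[(\[{])" +         // assumption: word character, then opening bracket
        @"|(?<=[)\]}])(?=\w)");         // assumption: closing bracket, then word character

    public static string Split(string title)
    {
        // Split at every zero-width boundary, trim the parts, and rejoin with spaces.
        return string.Join(" ", Boundary.Split(title).Select(part => part.Trim()));
    }

    public static void Main()
    {
        Console.WriteLine(Split("Example(Test)Title"));
        Console.WriteLine(Split("HELLO-WORLD2019T.E.S.T."));
    }
}
```

With this variant, hyphenated words and dotted initials stay intact, while a word that touches a bracket is still separated from it.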







edited Mar 11 at 6:05

























answered Mar 11 at 6:00









Tim Biegeleisen













  • I ran into one issue: single-letter words like A and I are not separated, because they fall under the ALL UPPERCASE rule (two uppercase letters next to each other). ATitleExample becomes ATitle Example.

    – Matt McManis
    Mar 11 at 7:35

  • @MattMcManis This is an edge case which will potentially break all of the answers given here. You would need to do more work to cover such cases.

    – Tim Biegeleisen
    Mar 11 at 7:36

  • Maybe I can run the output of this through a second regex to fix those.

    – Matt McManis
    Mar 11 at 7:38



















9







Aiming for simplicity rather than one huge regex, I would recommend this code, which applies several small, simple patterns in sequence (the comments in the code explain each one):



string str = "ThisIsAnExampleTitleHELLO-WORLD2019T.E.S.T.(Test)\"Test\"'Test'[Test]";
// insert a space when a lowercase letter is followed by an uppercase letter
str = Regex.Replace(str, "(?<=[a-z])(?=[A-Z])", " ");
// insert a space whenever a digit is followed by a letter
str = Regex.Replace(str, @"(?<=\d)(?=[A-Za-z])", " ");
// insert a space when a letter is followed by a digit
str = Regex.Replace(str, @"(?<=[A-Za-z])(?=\d)", " ");
// insert a space before one of the characters ( [ " ' when it is followed by a letter or digit
str = Regex.Replace(str, @"(?=[(\[""'][a-zA-Z0-9])", " ");
// insert a space after a closing ) ] " ' (one preceded by a letter or digit) when more text follows directly;
// the extra checks avoid a space after opening quotes and doubled or trailing spaces
str = Regex.Replace(str, @"(?<=[a-zA-Z0-9][)\]""'])(?=\S)", " ");





answered Mar 11 at 7:29









Michał Turczyn













  • If commenting was your main concern, you could enable x-mode or use inline comments, e.g. (?#insert space when there's letter followed by digit).

    – revo
    Mar 11 at 7:45

  • @revo I used standard C# comments :) I think it's more readable.

    – Michał Turczyn
    Mar 11 at 7:46

  • You could also write comments that are just as readable by setting the standard x modifier, which lets you write multiline, indented comments. It's not simple, by the way. Just split.

    – revo
    Mar 11 at 7:49
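
For reference, the x-mode mentioned in these comments corresponds to `RegexOptions.IgnorePatternWhitespace` in .NET, where unescaped whitespace in the pattern is ignored and `#` starts a comment that runs to the end of the line. The following is a minimal sketch of that idea applied to the letter/digit rules only; the `CommentedPattern` class and `SpaceLettersAndDigits` method are hypothetical names chosen for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CommentedPattern
{
    // One pattern, documented inline instead of with C# comments between calls.
    private static readonly Regex LetterDigitBoundary = new Regex(@"
        (?<=[A-Za-z])(?=\d)     # letter followed by digit
        |                       # or
        (?<=\d)(?=[A-Za-z])     # digit followed by letter
        ", RegexOptions.IgnorePatternWhitespace);

    public static string SpaceLettersAndDigits(string text)
    {
        // Each match is zero-width, so Replace simply inserts a space there.
        return LetterDigitBoundary.Replace(text, " ");
    }

    public static void Main()
    {
        Console.WriteLine(SpaceLettersAndDigits("Hello2019World"));
    }
}
```

Whether this reads better than separate `Regex.Replace` calls with ordinary C# comments is largely a matter of taste, as the exchange above suggests.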





















7







You could shorten the regular expression by interpreting the requirements differently. For example, the first requirement amounts to saying: put a space before a capital letter only if it is not preceded by a punctuation mark or another capital letter.



The following regex works for almost all of the mentioned requirements and may be extended to include or exclude other situations:



(?<!^|[A-Z\p{P}])[A-Z]|(?<=\p{P})\p{P}


You have to use the Replace() method with " $0" (a space followed by the whole match) as the substitution string.



.NET:



string input = @"ThisIsAnExample.TitleHELLO-WORLD2019T.E.S.T.(Test)""Test""'Test'[Test]";
Regex regex = new Regex(@"(?<!^|[A-Z\p{P}])[A-Z]|(?<=\p{P})\p{P}", RegexOptions.Multiline);
// prefix every match with a space
Console.WriteLine(regex.Replace(input, @" $0"));
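
The pattern above leaves digits attached to the preceding word; in the comments on this answer, revo suggests adding `(?<=\p{L})\d` to the alternation. Here is a small sketch of that combined pattern, assuming a hypothetical `PunctuationAwareSpacer` class and `Space` method that are not part of the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PunctuationAwareSpacer
{
    private static readonly Regex Boundary = new Regex(
        @"(?<!^|[A-Z\p{P}])[A-Z]" +   // capital not at the start and not after a capital or punctuation
        @"|(?<=\p{P})\p{P}" +          // punctuation directly after punctuation
        @"|(?<=\p{L})\d");             // digit directly after a letter (from the comments)

    // Put a space in front of every match; $0 is the whole matched text.
    public static string Space(string title) => Boundary.Replace(title, " $0");

    public static void Main()
    {
        Console.WriteLine(Space("Hello2019World"));
        Console.WriteLine(Space("HELLO-WORLD2019T.E.S.T.(Test)"));
    }
}
```

Unlike the split-based approach, this one consumes the matched character and writes it back through `$0`, which is why the replacement string needs the leading space.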





edited Mar 11 at 7:12

























answered Mar 11 at 7:06









revo













  • This is an interesting way. Which rule can be added to fix HELLO-WORLD2019 by spacing the 2019?

    – Matt McManis
    Mar 11 at 7:10

  • Add (?<=\p{L})\d within an alternation: (?<!^|[A-Z\p{P}])[A-Z]|(?<=\p{P})\p{P}|(?<=\p{L})\d.

    – revo
    Mar 11 at 7:17

  • I have one other issue: single-letter words like A and I won't space. ATitleExample becomes ATitle Example.

    – Matt McManis
    Mar 11 at 8:23

  • What about something like OTPIsADevice?

    – revo
    Mar 11 at 8:29

  • It starts to get complicated. That gives OTPIs ADevice; maybe I can run the output through a second filter. Rules: if a word starts with two uppercase letters (ADevice), add a space after the first letter (A Device). And if an all-uppercase word ends in a lowercase letter (OTPIs), add a space before the last two letters (OTP Is).

    – Matt McManis
    Mar 11 at 8:54
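
The two second-pass rules proposed in the last comment can be approximated with one lookaround pair: insert a space between an uppercase letter and an uppercase letter that is itself followed by a lowercase letter. This is only a heuristic sketch, with the hypothetical names `AcronymPass` and `SplitAcronyms`; it cannot tell a single-letter word from the last letter of an acronym, so an input such as IPhone would also be split.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AcronymPass
{
    // Boundary between an uppercase letter and the start of a capitalized word.
    private static readonly Regex UpperBeforeWord = new Regex(@"(?<=[A-Z])(?=[A-Z][a-z])");

    public static string SplitAcronyms(string text)
    {
        // Zero-width match: Replace inserts a space at each boundary.
        return UpperBeforeWord.Replace(text, " ");
    }

    public static void Main()
    {
        Console.WriteLine(SplitAcronyms("ATitle Example"));
        Console.WriteLine(SplitAcronyms("OTPIs ADevice"));
    }
}
```

Run as a second pass over the output of any of the answers, it leaves all-uppercase words such as HELLO-WORLD untouched, because they contain no lowercase letter.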




















