Excel substring: how to extract text from cell

The tutorial shows how to use the Substring functions in Excel to extract text from a cell, get a substring before or after a specified character, find cells containing part of a string, and more.

Before we start discussing different techniques to manipulate substrings in Excel, let's just take a moment to define the term so that we can begin on the same page. So, what is a substring? Simply, it's part of a text entry. For example, if you type something like "AA-111" in a cell, you'd call it an alphanumeric string, and any part of the string, say "AA", would be a substring.

Although Excel has no function named SUBSTRING, it has three Text functions (LEFT, RIGHT, and MID) that extract a substring of a given length. The FIND and SEARCH functions help you get a substring before or after a specific character. A handful of other functions handle more complex operations, such as extracting numbers from a string, replacing one substring with another, or looking up a partial text match. Below you will find formula examples for the most common of these tasks.

How to extract substring of a certain length

Microsoft Excel provides three different functions to extract text of a specified length from a cell. Depending on where you want to start an extraction, use one of these formulas:

  • LEFT function - to extract a substring from the left.
  • RIGHT function - to extract text from the right.
  • MID function - to extract a substring from the middle of a text string, starting at the point you specify.

As with other formulas, Excel substring functions are best learned from examples, so let's look at a few.

Extract substring from start of string (LEFT)

To extract text from the left of a string, you use the Excel LEFT function:

LEFT(text, [num_chars])

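Where text is the source string (usually a reference to the cell that contains it), and num_chars is the number of characters you want to extract. The square brackets mean that num_chars is optional: if you omit it, LEFT returns only the first character.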

For example, to get the first 4 characters from the beginning of a text string, use this formula:

=LEFT(A2,4)

LEFT formula to extract a substring from the start of a string

Get substring from end of string (RIGHT)

To get a substring from the right part of a text string, go with the Excel RIGHT function:

RIGHT(text, [num_chars])

For instance, to get the last 4 characters from the end of a string, use this formula:

=RIGHT(A2,4)

RIGHT formula to extract a substring from the end of a string

Extract text from middle of string (MID)

If you are looking to extract a substring starting in the middle of a string, at the position you specify, then MID is the function you can rely on.

Compared to the other two Text functions, MID has a slightly different syntax:

MID(text, start_num, num_chars)

Aside from text (the original text string) and num_chars (the number of characters to extract), you also indicate start_num (the starting point).

In our sample data set, to get three characters from the middle of a string beginning with the 6th character, you use the following formula:

=MID(A2,6,3)

MID formula to extract text from the middle of a string

Tip. The output of the Right, Left and Mid formulas is always text, even when you are extracting a number from a text string. If you want to operate on the result as a number, then wrap your formula in the VALUE function like this: =VALUE(MID(A2,6,3))
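For readers who want to check the logic of these three functions outside Excel, here is a minimal Python sketch. The helper names left, right and mid and the sample string are made up for illustration and are not part of Excel. The sketch assumes num_chars is not negative and only mirrors the behavior described above, including the 1-based start_num of MID and the text-to-number conversion from the tip.

```python
def left(text: str, num_chars: int = 1) -> str:
    """Mimic Excel LEFT: the first num_chars characters."""
    return text[:num_chars]


def right(text: str, num_chars: int = 1) -> str:
    """Mimic Excel RIGHT: the last num_chars characters."""
    # text[-0:] would return the whole string, so handle 0 explicitly.
    return text[-num_chars:] if num_chars > 0 else ""


def mid(text: str, start_num: int, num_chars: int) -> str:
    """Mimic Excel MID: start_num is 1-based, as in Excel."""
    return text[start_num - 1 : start_num - 1 + num_chars]


sample = "ABCD-123-WXYZ"  # hypothetical cell value

print(left(sample, 4))    # ABCD
print(right(sample, 4))   # WXYZ
print(mid(sample, 6, 3))  # 123

# Like VALUE(MID(A2,6,3)): the extracted piece is text until converted.
print(int(mid(sample, 6, 3)) + 1)  # 124
```

The only real difference from Excel is indexing: Python counts from 0, so the sketch subtracts 1 from start_num.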

Extract substring before or after a given character

As shown in the above examples, the Left, Right and Mid functions cope nicely with uniform strings. When you are dealing with text strings of variable length, more complex manipulations are needed.

Note. In all of the below examples, we will be using the case-insensitive SEARCH function to get the position of a character. If you want a case-sensitive formula, use the FIND function instead.

How to extract text before a specific character

To get a substring preceding a given character, two things are to be done: first, you determine the position of the character of interest, and then you pull all characters before it. More precisely, you use the SEARCH function to find the position of the character, and subtract 1 from the result, because you don't want to include the character itself in the output. And then, you send the returned number directly to the num_chars argument of the LEFT function:

LEFT(cell, SEARCH("char", cell)-1)

For example, to extract a substring before the hyphen character (-) from cell A2, use this formula:

=LEFT(A2, SEARCH("-",A2)-1)

No matter how many characters your Excel string contains, the formula only extracts text before the first hyphen:
Extracting text before a specific character
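The same two-step logic (find the position, then take everything before it) can be sketched in Python. The functions search and text_before and the sample value are hypothetical names used only for illustration. The search helper imitates Excel's case-insensitive, 1-based SEARCH, but does not model its wildcard characters.

```python
def search(find_text: str, within_text: str, start_num: int = 1) -> int:
    """Mimic Excel SEARCH: case-insensitive, returns a 1-based position.

    Raises ValueError where Excel would return #VALUE!.
    Excel's wildcards (? and *) are not modeled in this sketch.
    """
    pos = within_text.lower().find(find_text.lower(), start_num - 1)
    if pos == -1:
        raise ValueError("#VALUE!")
    return pos + 1


def text_before(cell: str, char: str) -> str:
    # Equivalent of LEFT(cell, SEARCH(char, cell) - 1)
    return cell[: search(char, cell) - 1]


print(text_before("AA-111-B2", "-"))  # AA
```

If the character is missing, the sketch raises an error, just as the Excel formula returns #VALUE! in that case.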

How to extract text after character

To get text following a specific character, you use a slightly different approach: get the position of the character with either SEARCH or FIND, subtract that number from the total string length returned by the LEN function, and extract that many characters from the end of the string.

RIGHT(cell,LEN(cell)-SEARCH("char", cell))

In our example, we'd use the following formula to extract a substring after the first hyphen:

=RIGHT(A2,LEN(A2)-SEARCH("-",A2))

Extracting text after a specific character
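As a rough cross-check of the arithmetic, here is the same calculation as a Python sketch. The names search and text_after and the sample value are assumptions made for illustration; wildcards supported by Excel's SEARCH are not modeled.

```python
def search(find_text: str, within_text: str, start_num: int = 1) -> int:
    """Mimic Excel SEARCH: case-insensitive, 1-based; ValueError if absent."""
    pos = within_text.lower().find(find_text.lower(), start_num - 1)
    if pos == -1:
        raise ValueError("#VALUE!")
    return pos + 1


def text_after(cell: str, char: str) -> str:
    # Equivalent of RIGHT(cell, LEN(cell) - SEARCH(char, cell))
    num_chars = len(cell) - search(char, cell)
    # Take the last num_chars characters (safe even when num_chars is 0).
    return cell[len(cell) - num_chars :]


print(text_after("AA-111-B2", "-"))  # 111-B2
```

Because the position of the first hyphen is used, everything after it is returned, including any later hyphens.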

How to extract text between two instances of a character

To get a substring between two occurrences of a certain character, use the following generic formula:

MID(cell, SEARCH("char", cell)+1, SEARCH("char", cell, SEARCH("char", cell)+1) - SEARCH("char", cell)-1)

The first two arguments of this MID formula are crystal clear:

Text is the cell containing the original text string.

Start_num (starting point) - a simple SEARCH formula returns the position of the desired character, to which you add 1 because you want to start extraction with the next character.

Num_chars (number of chars to extract) is the trickiest part:

  • First, you work out the position of the second occurrence of the character by nesting one Search function within another.
  • After that, you subtract the position of the 1st occurrence from the position of the 2nd occurrence, and subtract 1 from the result since you don't want to include the delimiter character in the resulting substring.

For example, to extract text surrounded by two hyphens, you'd use this formula:

=MID(A2, SEARCH("-",A2) + 1, SEARCH("-",A2,SEARCH("-",A2)+1) - SEARCH("-",A2) - 1)

The screenshot below shows the result:
Extracting text between two hyphens
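To make the nested SEARCH easier to follow, the sketch below spells out each step in Python. The helper names search and text_between and the sample value are hypothetical; the start_num argument of search plays the role of the third argument of Excel's SEARCH.

```python
def search(find_text: str, within_text: str, start_num: int = 1) -> int:
    """Mimic Excel SEARCH: case-insensitive, 1-based; ValueError if absent."""
    pos = within_text.lower().find(find_text.lower(), start_num - 1)
    if pos == -1:
        raise ValueError("#VALUE!")
    return pos + 1


def text_between(cell: str, char: str) -> str:
    first = search(char, cell)              # SEARCH(char, cell)
    second = search(char, cell, first + 1)  # SEARCH(char, cell, first + 1)
    start_num = first + 1                   # start right after the 1st delimiter
    num_chars = second - first - 1          # exclude both delimiters
    # Equivalent of MID(cell, start_num, num_chars)
    return cell[start_num - 1 : start_num - 1 + num_chars]


print(text_between("AA-111-B2", "-"))  # 111
```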

If you are looking to extract text between the 2nd and 3rd or the 3rd and 4th occurrences of the same character, you can use a more compact FIND and SUBSTITUTE combination to get the character's position, as explained in How to find Nth occurrence of a character in a string:

FIND(CHAR(1),SUBSTITUTE(cell,character,CHAR(1),Nth occurrence))

In our case, we could extract a substring between the 2nd and 3rd hyphens with the following formula:

=MID(A2, FIND(CHAR(1),SUBSTITUTE(A2,"-",CHAR(1),2))+1, FIND(CHAR(1),SUBSTITUTE(A2,"-",CHAR(1),3)) - FIND(CHAR(1),SUBSTITUTE(A2,"-",CHAR(1),2))-1)

Extracting a substring between the 2nd and 3rd hyphens
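The marker trick works because SUBSTITUTE replaces only the Nth occurrence with CHAR(1), a character unlikely to appear in normal text, and FIND then locates that marker. A minimal Python sketch of the idea follows; the function names and the sample value are assumptions for illustration, not Excel features.

```python
def substitute_nth(text: str, old: str, new: str, instance_num: int) -> str:
    """Mimic Excel SUBSTITUTE with instance_num: replace only that occurrence."""
    start = 0
    pos = -1
    for _ in range(instance_num):
        pos = text.find(old, start)
        if pos == -1:
            return text  # fewer occurrences than requested: text is unchanged
        start = pos + len(old)
    return text[:pos] + new + text[pos + len(old) :]


def nth_position(cell: str, char: str, n: int) -> int:
    # Equivalent of FIND(CHAR(1), SUBSTITUTE(cell, char, CHAR(1), n))
    marker = chr(1)
    pos = substitute_nth(cell, char, marker, n).find(marker)
    if pos == -1:
        raise ValueError("#VALUE!")
    return pos + 1  # 1-based, like FIND


def between_nth(cell: str, char: str, n: int) -> str:
    """Text between the n-th and (n+1)-th occurrence of char."""
    start_num = nth_position(cell, char, n) + 1
    num_chars = nth_position(cell, char, n + 1) - nth_position(cell, char, n) - 1
    return cell[start_num - 1 : start_num - 1 + num_chars]


print(between_nth("AA-111-B2-X9", "-", 2))  # B2
```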

How to find substring in Excel

In situations when you don't want to extract a substring and only want to find cells containing it, you use the SEARCH or FIND function as shown in the above examples, but perform the search within the ISNUMBER function. If a cell contains the substring, the Search function returns the position of the first character, and as long as ISNUMBER gets any number, it returns TRUE. If the substring is not found, the search results in an error, forcing ISNUMBER to return FALSE.

ISNUMBER(SEARCH("substring", cell))

Suppose you have a list of British postcodes in column A and you want to find those that contain the substring "1ZZ". To do this, use the following formula:

=ISNUMBER(SEARCH("1zz", A2))

The results will look something similar to this:
Finding cells that contain a specific substring

If you'd like to return your own message instead of the logical values TRUE and FALSE, nest the above formula in the IF function:

=IF(ISNUMBER(SEARCH("1zz", A2)), "Yes", "")

If a cell contains the substring, the formula returns "Yes"; otherwise it returns an empty string (""):
Formula to identify cells that contain a certain substring

As you may remember, the Excel SEARCH function is case-insensitive, so you use it when the character case does not matter. To get your formula to distinguish the uppercase and lowercase characters, opt for the case-sensitive FIND function.
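The difference between the two lookups can be illustrated with a short Python sketch. The contains helper, its case_sensitive flag, and the sample postcodes are made up for this example: the default path imitates ISNUMBER(SEARCH(...)), and the case-sensitive path imitates ISNUMBER(FIND(...)).

```python
def contains(cell: str, substring: str, case_sensitive: bool = False) -> bool:
    """True if cell contains substring.

    case_sensitive=False imitates ISNUMBER(SEARCH(substring, cell));
    case_sensitive=True imitates ISNUMBER(FIND(substring, cell)).
    """
    if case_sensitive:
        return substring in cell
    return substring.lower() in cell.lower()


postcodes = ["AB1 1ZZ", "CD2 3XY", "ef4 1zz"]  # hypothetical sample data

# Equivalent of IF(ISNUMBER(SEARCH("1zz", A2)), "Yes", "")
flags = ["Yes" if contains(code, "1zz") else "" for code in postcodes]
print(flags)  # ['Yes', '', 'Yes']

# With a case-sensitive match, only the lowercase entry qualifies.
strict = [contains(code, "1zz", case_sensitive=True) for code in postcodes]
print(strict)  # [False, False, True]
```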

For more information on how to find text and numbers in Excel, please see If cell contains formula examples.

To have a closer look at the formulas discussed in this tutorial, you are welcome to download our sample Excel Substring Functions workbook.

How to extract text from cell with Ultimate Suite for Excel

As you have just seen, Microsoft Excel provides an array of different functions to work with text strings. In case you are unsure which function is best suited for your needs, commit the job to our Ultimate Suite for Excel. With these tools in your Excel's arsenal, you just go to Ablebits Data tab > Text group, and click Extract:
Extract Text tool in Excel

Now, you select the source cells, and whatever complex strings they contain, a substring extraction boils down to these two simple actions:

  1. Specify how many characters you want to get from the start, end or middle of the string; or choose to extract all text before or after a given character.
  2. Click Insert Results. Done!

For example, to pull the domain names from the list of email addresses, you select the All after text radio button and type @ in the box next to it. To extract the user names, you select the All before text radio button, as shown in the screenshot below.
Define the substring you want to extract.

And you will get the following results in a moment:
Parts of email addresses are extracted in separate columns.

Apart from speed and simplicity, the Extract Text tool has extra value - it will help you learn Excel formulas in general and substring functions in particular. How? By selecting the Insert as formula checkbox at the bottom of the pane, you ensure that the results are output as formulas, not values.

In this example, if you select cells B2 and C2, you will see the following formulas, respectively:

  • To extract username:

    =IFERROR(LEFT(A2,SEARCH("@",A2)-1),"")

  • To extract domain:

    =IFERROR(RIGHT(A2, LEN(A2)- SEARCH("@",A2) - LEN("@") + 1),"")

How much time would it take you to figure out these formulas on your own? ;)

Since the results are formulas, the extracted substrings will update automatically as soon as any changes are made to the original strings. When new entries are added to your data set, you can copy the formulas to other cells as usual, without having to run the Extract Text tool anew.
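To see what the two generated formulas do, including the IFERROR fallback, here is a small Python sketch. The split_email function and the sample addresses are hypothetical and only illustrate the logic; they are not part of Excel or of Ultimate Suite.

```python
def split_email(address: str) -> tuple[str, str]:
    """Return (username, domain), or two empty strings if there is no '@'.

    The empty-string fallback plays the role of IFERROR(..., "").
    """
    pos = address.find("@")
    if pos == -1:
        return "", ""
    username = address[:pos]    # like LEFT(A2, SEARCH("@", A2) - 1)
    domain = address[pos + 1 :]  # like RIGHT(A2, LEN(A2) - SEARCH("@", A2))
    return username, domain


print(split_email("jane.doe@example.com"))  # ('jane.doe', 'example.com')
print(split_email("not-an-email"))          # ('', '')
```

In the domain formula, the terms - LEN("@") + 1 cancel out for a one-character delimiter, which is why the sketch simply takes everything after the @ sign.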

If you are curious to try this as well as many other useful features included with Ultimate Suite for Excel, you are welcome to download a 15-day trial version. And if you like the tools and decide to get a license, be sure to grab the 15% off coupon code that we provide exclusively to our blog readers: AB14-BlogSpo.

More formulas for substrings in Excel

In this tutorial, we have demonstrated some classic Excel formulas to extract text from string. As you understand, there can be almost infinite variations of these basic scenarios. Below you will find a few more formula examples where the Text functions come in handy.

34 Responses to "Excel substring: how to extract text from cell"

  1. Steven says:

    Nice primer, however, extracting a number from a string leaves me with another string. When I convert these strings to numbers via format I can't operate on them as numbers, they are still strings?

    • Hi Steven,

      Right, whenever you use a Text function (Right, Left, Mid) to extract something, the output is always text. In case of extracting a number from a string, the result is a numeric substring, which in terms of Excel is also text, not number. To convert it to number, you can wrap your formula in the VALUE function. Here's an example in the simplest form:

      =VALUE(LEFT(A2,4))

      And thank you for a smart question! I've added this tip to the article.

  2. Hank says:

    When I download the weekly raw data to an Excel spreadsheet, I was looking for a way to extract the name from the name-and-initials/initials-and-number in the raw data. Your tip, LEFT(cell, SEARCH("char",cell)-1), is excellent. Below are some fictitious names:
    Hofman, Todd (th81)
    Medemblik, Terry (tjm)
    Williams, Mark (mrw)
    Vanderzwaag, Ron (rvz)
    Linde, Tim (tl1074)
    Nijenhuis, Andy (andy)
    Verhey, Tim (tmv)
    Moesker, James (jm1181)
    Martin, Nevin (nwm)
    Swaving, Brad (bs776)

  3. Julia Leibbrand says:

    "Prescriber: Mitchell CNM H Becky

    Patient: Patient Name

    Medication: FE TABS 325 (65 FE) MG ORAL TABLET DELAYED RELEASE; Qty: 60 Refills: PRN

    Ordered: 29-Dec-2017@1130

    Signed/Transmitted: 29-Dec-17@1130

    Pharmacy: Access Family Care Pharmacy*
    530 South Maiden Lane
    Joplin, MO
    Ph: 4177827209
    Fax: 4177827727"

    Above is a text field in an excel spreadsheet. I need to make the medication in a field by itself.

    Can you tell me how to pull what is after Medication:?

  4. RK says:

    The search function you describe does not work at all.

  5. Vic says:

    Challenge: I need to extract a string between two other strings (a word and a character) within a cell, but those other strings may repeat inside the cell.

    For example - a scanning tool conducts 10 tests and returns a failure for one but includes all the Passes in the output. The test results are separated within each cell with *****

    multi-line cell example:

    Test: Endpoint
    Result: Pass
    Details: Endpoint Connected
    *****
    Test: Resolution
    Result: Fail
    Details: Unable to resolve FQDN
    *****
    Test: Port
    Result: Fail
    Details: Port 1234 does not respond

    First thought was to re-organize the output using Text to Columns, but for whatever reason it doesn't work, even after replacing the ***** with a single char like ^ (I'm guessing it's thrown off by the carriage returns within the cell).

    So we're left with searching for some combination of Mid, Len, Search/Find but it's proving extremely difficult. It's easy to find the first instance of "Failed", but hard to then find the stopping point which has to be the very next ^ char after "Failed", while also then being able to repeat the extraction for any other Failed tests inside the cell.

  6. pratik says:

    How to get first name from below name list

    aakash kumar, Arvind soni, Tejas Sawant

    I want to extract the first name, "aakash kumar". How do I do this? Kindly help me.

  7. Abraham says:

    Hi,
    I want to understand how to implement a Case statement in Excel 2010. I have more than 20 conditions to match, and if/else is not helping me out. I have also researched on the internet and was advised to use the Choose function, which does not work either. Could you please help me overcome the problem? Here is some sample code that I have come up with to resolve my issue.
    =CHOOSE(
    (
    ( 2 * COUNT(FIND("ID",AR2))>0) +
    ( 4 * COUNT(FIND("CD",AR2))>0) +
    ( 8 * COUNT(FIND("DTE",AR2))>0) +
    ( 16 * COUNT(FIND("PH",AR2))>0) +
    ( 32 * COUNT(FIND("AMT",AR2))>0) +
    ( 64 * COUNT(FIND("PER",AR2))>0) +
    ( 128 * COUNT(FIND("TOT",AR2))>0) +
    ( 256 * COUNT(FIND("NME",AR2))>0) +
    ( 512 * COUNT(FIND("NUM ",AR2))>0) +
    ( 1024 * COUNT(FIND("DESC",AR2))>0) +
    ( 2048 * COUNT(FIND("TYPE",AR2))>0) +
    ( 4096 * COUNT(FIND("ADDR",AR2))>0) +
    ( 8192 * COUNT(FIND("EMAIL",AR2))>0) +
    ( 16384 * COUNT(FIND("RATE",AR2))>0) +
    ( 32768 * COUNT(FIND("PCT",AR2))>0) +
    ( 65536 * COUNT(FIND("NOTE",AR2))>0) +
    ( 131072 * COUNT(FIND("ZIP",AR2))>0) +
    ( 262144 * COUNT(FIND("STE",AR2))>0) +
    ( 524288 * COUNT(FIND("TITLE",AR2))>0) +
    ( 1048576 * COUNT(FIND("SUM",AR2))>0)
    ),
    "I_",
    "C_",
    "D_",
    "T_",
    "A_",
    "P_",
    "A_",
    "T_",
    "I_",
    "T_",
    "C_",
    "T_",
    "T_",
    "P_",
    "P_",
    "T_",
    "T_",
    "T_",
    "T_",
    "Q_"
    )

  8. Matt says:

    I have a string (05/21/201806:59:0511.311.311.711.3) and want to break the data up. I know I can use Left(), Mid() and Right() to get what I want. But I am trying to find a way to copy/paste without the formula in the final cell. I will need to be able to copy the separated data into another spreadsheet. Is there a way to paste into another cell without the formula in it?

    • Doug says:

      Matt:
      If I understand your question correctly I would say you can right click choose Paste Special, then choose the Value button. No formulas, only values.

  9. Paul says:

    Thanks - I tested out the functions and learned something new. I can extract the JPN or GB from "Chatoyancy(JPN)" or Morgan's Mambo(GB) - BUT I really need to separate the text so I am left with "Chatoyancy" and "JPN", or "Morgan's Mambo" and "GB".

  10. Paul says:

    Regarding my previous post - I'm trying Text to Columns - awkward, but I think I got it.

  11. Bode says:

    SCPA-2F1ASO-PAC01-JB01-PA-100 How can I extract the data between the 2nd and 3rd hyphens into one cell, and the data between the 3rd and 4th hyphens into a different cell?

    Many thanks

    • Doug says:

      Bode:
      You can use the Text-to-Columns tool. It's under the Data tab. Just enter "-" as the delimiter.
      Or you can enter this into an empty cell and copy it across as many columns as needed. My test data was in A1, so you might want to change that address.
      You can also copy it down as many rows as needed.
      =TRIM(MID(SUBSTITUTE($A1,"-",REPT(" ",100)),(COLUMNS($A$1:A$1)-COLUMN($A$1))*100+1+SUMPRODUCT(LEN($A1:A1))-LEN($A1),100))

  12. julia says:

    hi
    how can i change format 243F160918D1 to {24, 3f, 16, 09, 18, d1}?

    • Doug says:

      Julia:
      Not sure how to format a cell to display the number in the manner you want, but here's a formula to add a comma after every other character.
      Where the original text is in A83:
      =LEFT(A83,2)&","&MID(A83,3,2)&","&MID(A83,5,2)&","&MID(A83,7,2)&","&MID(A83,9,2)&","&MID(A83,11,2)
      As you can see we start at the left of the text in A83 return the first two characters and then concatenate a comma. Then we use the MID function to concatenate the text in A83 beginning with the third character and return the next two characters and concatenate a comma.
      Continue with this same process until all characters have been added to the new string which contains a comma after every other character except the last one.
      You can change the cell addresses, the number of characters to skip and/or return and the inserted character to suit your needs.

  13. Sid says:

    Hi,
    I have a sheet with a list of docs (>500k), with their respective path. I want to extract the folder name from the path. Since, it has folders and sub-folders, the folder name doesnt come at the same position. The path is reported as "E:\abc\defg\...\xyz\". Basically, I'm looking to extract the text between the last two "\" i.e. "xyz".
    Thanks in advance

  14. Abdul Musawwir says:

    How can I separate out the cheque no. into a separate cell? Please guide me. The data in the cell is:
    Cell data
    Cash Withdrawal # 58989,CHQ
    00060706,AHSAN JADOON
    4240113035499 CHQ 00060706 DHA SHAHBAZ BRANCH,4240113035499

  15. Doug says:

    Abdul:
    If the data is in H37 as one long string, enter this in an empty cell.
    =MID(H37,SEARCH("CHQ",H37)+4,8)
    It says, in H37 find the first "CHQ" go forward 4 characters which includes the "CHQ" and a space and return the next 8 characters.

  16. Emmylou says:

    Hi, how do I remove a decimal point for a number 04.80 to be converted to 0480?
    Thanks for your help.

  17. Naomi says:

    I get reports where the name format isn't consistent -- some are FirstName LastName and others are LastName, FirstName.

    Can I use an IF statement combined with your extract formulas to create a single formula that extracts LastName from either format?

    Like this: IF(entry contains comma, LEFT formula, RIGHT formula)

    Would that work?

    And thanks for the great examples and explanations!

    • Hi Naomi,

      This idea has never occurred to me, but I suppose yes, it will work.

      Here's the formula to extract the last name:
      =IF(ISNUMBER(SEARCH(",",A1)), LEFT(A1, SEARCH(",", A1)-1), RIGHT(A1,LEN(A1)-SEARCH(" ",A1,1)))

      If you'd like to get the first name too, you can find the appropriate LEFT and RIGHT formulas in this tutorial: How to split names in Excel.

  18. Naomi says:

    Many thanks! The last name extract worked beautifully -- I'll try the first name next.

    Doug also sent me information -- I really appreciate everyone's help.

  19. Sumesh Kurichiveettil says:

    i have a string in one cell with length of 400 characters ( A1) . Is there a way i can extract first 150 characters in ( B1) one cell, next 150 in (C1)2nd cell and balance in 3rd (D1)

    Thanks

    Sumesh

    • Doug says:

      Sumesh:
      The answer to your question is easy. My only question is why are there 400 characters in one cell?

      In B1 =LEFT(A1,150)
      In C1 =MID(A1,151,150)
      in D1 =RIGHT(A1,100)

  20. Saura Tripathy says:

    Hi,

    Need some help with extracting data from cell with multiple lines. For example below data is in a single cell. This is a report generated against multiple software with found version and existing version. There are thousands of cell with similar data with "Found: xxxxx". I would like to know a way to get ONLY "Found: xxxxx" data into another column. Would that be possible?

    --------------------------------
    Tested: C:\Windows\Microsoft.NET\Framework\v4.0.30319\mscorlib.dll

    Found: 4.7.3130.0

    Context:

    Fix: Install the appropriate patch from Microsoft or through Windows Update.
    --------------------------------

  21. Subbarao says:

    I have a data format like this "QA_CASE_1234_John_Smith" How do I extract John Smith in a separate cell without any underscore in John Smith name.

  22. Andrew says:

    If I have this string:

    Babababa{njnj}v_ANDREWGORT@andrew@gort.com

    How do I extract the 'ANDREW' alone?

    This is what I have:

    =MID(C2,SEARCH("v_",C2)+2,SEARCH("GORT",C2, SEARCH("v_",C2)+2) - SEARCH("GORT",C2)-1)

    It appears that "GORT" doesn't work. Is there any way to do this?

  23. Ashish Thapa says:

    I want to extract last 2 word before and after "-" in one formula
    eg.
    Abc-ccc-ddd, here I need ccc-ddd

    Fff-hhh-kkk-yyy, here I need kkk-yyy

    Kkkk-hhh, here I need kkkk-hhh

  24. Marcus Gallagher says:

    Hello i'm trying to extract dimension from an item name column

    E.g. item names are all in a column and one would be "White fence post 80x80 1.8M" but another might be "RT1 35x150 treated decking 4.8MT"

    In these cases how do i pull out just those numbers?

  25. Joanne says:

    I am trying extract only the GB information from a cell in Excel. However in my report, there is an unknown number of times that the GB information will appear and I have at least 2000 lines to go through.

    A cell could look like this:
    0 GB (0% or 1.0:1), 9 GB (31% or 1.5:1)
    or like this:
    453 GB (29% or 1.4:1), 728.25 GB (33% or 1.5:1), 422.25 GB (33% or 1.5:1), 923.5 GB (42% or 1.7:1), 705.5 GB (33% or 1.5:1)

    I want to extract each occurrence of the GB amount (i.e. 0 GB, 9 GB from the first example above and 453 GB, 728.25 GB, 422.25 GB, 923.5 GB, 705.5 GB from the 2nd example above)

    The ultimate goal is to be able to add the GB amounts together for each cell and get a total. I tried splitting the text, but because some of the cells have so many occurrences of GB's it didn't seem like a good choice.

    Does anyone have any ideas?
    Thank you in advance,
    Joanne
