I have a bunch of text strings in Excel 2016. I am trying to search for various substrings. To be more helpful for reviewers, I want to return the word, delimited by spaces, which contains that substring.
For example, if my string is Will be retiring in June
and my substring is ret
, I would like the formula to return retiring
. This will allow reviewers to distinguish retired
from retiring
or return
.
I have written the below formula. There are multiple main strings to search. They are in columns H - M. There are multiple substrings to search for; they are in cells AQ1 - BF1. I would like the found word to populate the AQ2 - BF60000.
AP1 will determine which main column to search. If cell AP1 is H, then AQ2 will return the word that AQ$1 is found in from cell $H2. If AP1 is changed to I, AQ2 would return the word from cell $I2. This was simply to cut down on the number of formulas needed, looking at only one column at a time.
Here's a picture.
H | I | ... | AP | AQ | AR | AS | ... | |
---|---|---|---|---|---|---|---|---|
1 | Work Status 1 | Work Status 2 | ... | H | ret | vol | unem | ... |
2 | Out on disability | FD volunteeer | ... | #VALUE! | #VALUE! | #VALUE! | ... | |
3 | Still unemployed | Returned April 2022 | ... | #VALUE! | #VALUE! | unemployed | ... | |
4 | Hasn't returned, volunteering | Will be retiring 2023 | ... | returned | volunteering | #VALUE! | ... | |
5 | Retired 2017 | ... | Retired | #VALUE! | #VALUE! | ... |
But if I change AP1 to be I instead of H, the table changes to
H | I | ... | AP | AQ | AR | AS | ... | |
---|---|---|---|---|---|---|---|---|
1 | Work Status 1 | Work Status 2 | ... | I | ret | vol | unem | ... |
2 | Out on disability | FD volunteeer | ... | #VALUE! | volunteeer | #VALUE! | ... | |
3 | Still unemployed | Returned April 2022 | ... | Returned | #VALUE! | #VALUE! | ... | |
4 | Hasn't returned, volunteering | Will be retiring 2023 | ... | retiring | #VALUE! | #VALUE! | ... | |
5 | Retired 2017 | ... | #VALUE! | #VALUE! | #VALUE! | ... |
The formula works, but it is a little slow, since I have about a million total cells to calculate. Is there any way this could be optimized or rewritten completely? This is a work project, so I have no ability to upgrade past Excel 2016 or install anything.
=MID(INDIRECT($AP$1&ROW()),
This finds the beginning of the word (finds first space beforehand and adds 1).
IFERROR(FIND("|",
SUBSTITUTE(LEFT(INDIRECT($AP$1&ROW()),
SEARCH(AQ$1,INDIRECT($AP$1&ROW()))
),
" ",
"|",
LEN(LEFT(INDIRECT($AP$1&ROW()),
SEARCH(AQ$1,INDIRECT($AP$1&ROW()))
)
)
- LEN(SUBSTITUTE(LEFT(INDIRECT($AP$1&ROW()),
SEARCH(AQ$1,INDIRECT($AP$1&ROW()))
),
" ",
""
)
)
)
),
0
) + 1,
This determines the number of characters from the start of the word to the end.
SEARCH(" ",
INDIRECT($AP$1&ROW()),
SEARCH(AQ$1,INDIRECT($AP$1&ROW()))
)
- IFERROR(FIND("|",
SUBSTITUTE(LEFT(INDIRECT($AP$1&ROW()),
SEARCH(AQ$1,INDIRECT($AP$1&ROW()))
),
" ",
"|",
LEN(LEFT(INDIRECT($AP$1&ROW()),
SEARCH(AQ$1,INDIRECT($AP$1&ROW()))
)
)
- LEN(SUBSTITUTE(LEFT(INDIRECT($AP$1&ROW()),
SEARCH(AQ$1,INDIRECT($AP$1&ROW()))
),
" ",
""
)
)
)
),
0
) - 1
)
Indirect
toindex
. Good luck.