how is the "find" algorithm in text editors constructed?
October 1st, 2003, 09:25 PM
|
#1 (permalink)
| | Senior Member
Join Date: Oct 2001
Posts: 881
| how is the "find" algorithm in text editors constructed?
I'm going to have to write a program that parses through a very large text file looking for an instance of a specific word....
in the past, I've just read it line by line, and searched each token for a match to what I want... however, this seems like it'd be really really slow. So I'm curious how programs like wordpad/notepad, etc. do it. I feel like I could really cut down on a lot of wasted time if I could mimic that implementation...
so I'm looking for suggestions or any sort of help that you have to offer!
thanks in advance,
-Z |
| |
October 2nd, 2003, 09:56 AM
|
#2 (permalink)
| | may contain mild peril
Join Date: Oct 2001 Location: UK
Posts: 3,329
|
I would select your open source text editor of choice, grab its source code and have a quick look at how it implements its search functions. I had a quick check and vim, emacs and nano all have a search.c that should give you some ideas; linked lists seem to be mentioned.
Regards
ed
__________________
I dreamt that a large eagle circled the room three times and then got into bed with me and took all the blankets.
|
| |
October 3rd, 2003, 01:30 AM
|
#3 (permalink)
| | Banned
Join Date: Sep 2003
Posts: 35
| Quote: |
however, this seems like it'd be really really slow.
| was it really slow in the past? like, when you did it?
text editors simply search the string/stream/char_array that is in memory. no reason for file/io, just memory search. |
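To make the "just memory search" point concrete, here is a minimal sketch of a brute-force scan over text that is already in memory. It is only an illustration of the general idea, not what wordpad or notepad actually do internally; `NaiveFind` and `find` are made-up names, and in Java the built-in `String.indexOf` does the same job.

```java
// Minimal sketch of a brute-force substring search over an in-memory buffer.
// NaiveFind and find are hypothetical names used only for illustration.
final class NaiveFind {
    /** Returns the index of the first match of needle at or after from, or -1 if none. */
    static int find(CharSequence text, CharSequence needle, int from) {
        int n = text.length();
        int m = needle.length();
        // Try every start position where the needle could still fit.
        for (int i = Math.max(from, 0); i + m <= n; i++) {
            int j = 0;
            // Compare character by character until a mismatch or a full match.
            while (j < m && text.charAt(i + j) == needle.charAt(j)) {
                j++;
            }
            if (j == m) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String buffer = "SCF iteration 1\nSCF iteration 2\nconverged\n";
        System.out.println(find(buffer, "converged", 0)); // prints 32
    }
}
```

Passing the previous result plus one as `from` gives a "find next", which is roughly how an editor steps from one match to the following one.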
| |
October 3rd, 2003, 01:59 AM
|
#4 (permalink)
| | Ultimate Member
Join Date: Oct 2001 Location: Montreal, QC
Posts: 1,950
|
It depends on the language you're using. I think the way I'd do it is to write it in Perl using a regular expression and make it into a module.. I'm not sure whether other languages have regexp or not. |
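For what it's worth, Java does ship regular expressions in `java.util.regex`, so the same idea works without Perl. A minimal sketch, assuming you want whole-word matches only; `RegexFind` and `findWord` are made-up names:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Minimal sketch of a whole-word search with java.util.regex.
// RegexFind and findWord are hypothetical names used only for illustration.
final class RegexFind {
    /** Returns the index of the first whole-word match of word in text, or -1 if none. */
    static int findWord(String text, String word) {
        // Pattern.quote escapes any regex metacharacters in the search word;
        // \b on each side restricts matches to word boundaries.
        Pattern pattern = Pattern.compile("\\b" + Pattern.quote(word) + "\\b");
        Matcher matcher = pattern.matcher(text);
        return matcher.find() ? matcher.start() : -1;
    }

    public static void main(String[] args) {
        // Skips the "converged" inside "unconverged" and matches the standalone word.
        System.out.println(findWord("unconverged then converged", "converged")); // prints 17
    }
}
```

For a plain fixed word a regex is not necessarily faster than a direct substring search; it mostly earns its keep when the thing you are looking for has some structure.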
| |
October 3rd, 2003, 06:56 PM
|
#5 (permalink)
| | Senior Member
Join Date: Oct 2001
Posts: 881
|
So I'm doing this in java right now, but I'll probably convert it to C or C++ (more likely) in the future.
The files that I'm going to be searching through are going to be multiple thousands of lines long... possibly up to 100k+ lines. (these are the output files of electronic structure calculations on various molecules). The output is so long because this technique is a self-consistent field (SCF) calculation, so it takes the results from the current run - prints them - then reinserts them into the new run. This repeats until the convergence tolerance is reached....
anyway... it's sort of beside the point. I'm looking for a way to rip a chunk of the output out of the file (the final output, actually) and then I'll be doing calculations based on the output that I take....
I was just hoping to find a faster/more efficient way to get to the information that I want than going line by line through these gigantic files.
any other ideas are invited and appreciated.
thanks
-Z |
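Since only the final block of output matters here, one option is to search backwards from the end of the text instead of forwards from the start. A minimal sketch, assuming every SCF cycle begins with some recognisable marker line; the marker string is whatever the program actually prints (the one in `main` is invented), and `LastBlock` and `lastBlock` are made-up names:

```java
// Minimal sketch: grab everything from the last occurrence of a marker to the end.
// LastBlock, lastBlock and the marker text are hypothetical, for illustration only.
final class LastBlock {
    /** Returns the text from the last occurrence of marker to the end, or "" if absent. */
    static String lastBlock(String output, String marker) {
        // lastIndexOf scans from the end, so earlier cycles are never examined.
        int start = output.lastIndexOf(marker);
        return start < 0 ? "" : output.substring(start);
    }

    public static void main(String[] args) {
        String output = "CYCLE 1\nE = -1.0\nCYCLE 2\nE = -1.1\n";
        System.out.print(lastBlock(output, "CYCLE")); // prints the "CYCLE 2" block
    }
}
```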
| |
October 4th, 2003, 12:48 AM
|
#6 (permalink)
| | Banned
Join Date: Sep 2003
Posts: 35
| Quote: |
any other ideas are invited and appreciated.
| except mine... Quote: |
I was just hoping to find a faster/more efficient way to get to the information that I want than going line by line through these gigantic files.
| stop searching line by line; read the gigantic file at once, then search for the token!
how slow was it in the past? |
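A minimal sketch of "read it all at once, then search" in Java, assuming the file fits comfortably in memory and is UTF-8 (or plain ASCII) text; `WholeFileSearch` and `findInFile` are made-up names:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Minimal sketch: load the whole file into one String, then search it in memory.
// WholeFileSearch and findInFile are hypothetical names used only for illustration.
final class WholeFileSearch {
    /** Returns the character index of the first occurrence of token in the file, or -1. */
    static int findInFile(Path file, String token) {
        try {
            // One bulk read instead of many per-line reads; assumes UTF-8 text.
            String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            return text.indexOf(token);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Whether this beats the line-by-line version on a 100k-line file is something to measure rather than assume; either way the whole file still has to be read from disk once.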
| |
October 4th, 2003, 01:03 AM
|
#7 (permalink)
| | Perfetc Member
Join Date: Jan 2003 Location: Maryland Suburbia
Posts: 4,334
|
Try searching the web for source code on "find" functions |