You might wanna make a PR, this sounds good.

graycat · on Jan 8, 2019

Wow! Sounds interesting. Alas, I've been programming too much with my head down and actually don't know what a PR is. What's a PR??

ncmncm · on Jan 8, 2019

"Pull request". If you expect to get ahead programming you are going to need to know about this.

graycat · on Jan 8, 2019

Thanks.

So far I haven't had to use GitHub.

I've done a LOT of computing for a LONG time. The code for my startup runs fine. I want to do a few, easy late tweaks before going live.

I have the first server running and am doing a few last little things.

The code is 100,000 lines of typing, 24,000 programming language statements, nearly all in Microsoft's Visual Basic .NET.

The next major step is to gather a lot of data for my database, much more than the data I have for testing now.

I taught computing at Georgetown and Ohio State, published in artificial intelligence and more in computer science and applied math at IBM's Watson lab, and cooked up some maybe new algorithms for my startup.

But I've never needed to use GitHub! I'm not against it, but so far the computing unique to my project has been fast, fun, and easy for me. The main bottleneck has been poor documentation. Then my development computer got sick, then I got sick, but now both I and my computers are all healthy!

The times I tried GitHub, it said that my Web browser is out of date! I have the latest from Firefox and was using my HP laptop with Windows 10 64 bit Home Edition -- what came with the laptop I got just to have something to order parts and gather information for my first server after my development computer quit -- some motherboard data corruption problem.

My server is an Asus old BIOS style motherboard with an AMD FX-8350 processor, 8 cores with standard clock at 4.0 GHz, 16 GB of DDR3 ECC main memory, 7 TB of hard disk, and Windows 7 64 bit Professional. If my startup gets traffic enough to keep that server busy, then I'll have a nice step up!

Especially since I'm using Microsoft, I'll likely use GitHub eventually.

pstuart · on Jan 8, 2019

Hopefully you're using some form of version control today. Git is the de facto winner of the version control ecosystem, so it's a valuable tool to know how to use at least at a basic level (branching, merging).

graycat · on Jan 8, 2019

Yup, thanks.

Yes, I have a simple version control system. I have some simple tools based on macros for my favorite editor KEdit and some scripts based on Rexx. So far they have been fine.

The biggest problem I had was documentation for .NET. I found, read, downloaded, and abstracted 4000+ Web pages from Microsoft's MSDN and have about another 1000 Web pages from elsewhere. These are in four batches -- the languages, especially Visual Basic .NET, for SQL Server, for communications including TCP/IP and ASP.NET (for the Web pages), and the rest for Windows in general. I also have documentation of my own, mostly just text but some with the math from Knuth's TeX.

So, in the code and other documentation, I put links to relevant documentation. The links in the code look to the compiler like comments. I insert and use the links with editor macros. In the end, I can display a page of documentation with one keystroke to my editor. Works well enough.

For help with versions, I have a very heavily used editor macro that inserts a time-date stamp or two of them delimited with BEGIN and END as comment blocks with time-date stamps.

Sure, if I hire some people, I'll have to have something better. But in our AI project at IBM's Watson lab, we did quite a lot of software development, including shipping some IBM Program Product code, with less in versioning tools than I'm using.

Right along I've been thinking that if my startup works, then one of the first steps up will be some audio-video facilities, a lecture hall, good audio and video recording and then editing, etc. Then we will call in experts on various topics, have them lecture, maybe sell their book or some such, record it all, and have it on-line for everyone as documentation. So, we'll have people who worked out good means of versioning, backup and recovery, archiving, data security, disaster protection, network security, server monitoring, server construction and software installation, SQL Server performance, LAN performance, code testing and reviews, developer training, the servers, racks, LAN, electric power, backup power, HVAC, real estate for floor space, telephones, e-mail, etc. Yup, I'll need a COO!

All these topics have been done well lots of times in the Fortune 100 and more, and lots of people have been there, done that, and still have the T-shirt. Likely don't necessarily have to hire them but can fly them in for 1-3 days and capture what they have to say. Later in the growth may have to do some things that are original, but can have a lot of growth before that.

anonytrary · on Jan 9, 2019

You're rolling your own version control for a 100k LOC project? You could easily pick up git in a day or two. Why not just use git? It represents your project as a tree; you can work on different branches of your project simultaneously, save progress on branches, merge them, etc.

graycat · on Jan 10, 2019

So far, the candle is not worth the match. I can and do easily accomplish plenty well enough the functionality you mention now for just one user, me -- I'm a sole, solo founder. And there is the intellectual property security issue, for a code repository or the cloud, and so far I don't use or really need either. I have lots of good uses for my time and have to allocate carefully.

Similarly, while I'm developing on Windows with no use of Linux at all, I make no use of Visual Studio; I greatly prefer my favorite text editor KEdit and its macro language Kexx, essentially Rexx.

I'm not saying that others should do what I'm doing. By far my favorite tools are KEdit and Rexx. Next comes D. Knuth's TeX.

I accept that if I had 50+ software developers then they would likely be heavy users of Visual Studio and GitHub. But for me, for now, they aren't worth the botheration.

Botheration, mud wrestling with software, has, after poor documentation, been my biggest obstacle.

ALL the work unique to my project has been fast, fun, and easy, the idea, applied math, code, etc. Thus I have a real sore spot about new, external tools -- my experience with such external things has been botheration I call mud wrestling. The time wasted has not been hours, days, or weeks but far beyond that.

IMHO the biggest bottleneck now in the future of computing is poor documentation. Second is mud wrestling with new products including tools.

The code is 100 K lines of typing but about 25,000 programming language statements, that is, 25 KLOC. That is, there are a LOT of comments and links to documentation in the code.

anonytrary · on Jan 8, 2019

Out of curiosity -- what is your startup? It sounds interesting, and 100,000 lines seems like a hefty chunk.

graycat · on Jan 8, 2019

I intend to announce an alpha test here on HN.

Some people at Battelle, etc. worked in information retrieval decades ago. A friend of my father's was involved. Those people concluded that key words could cover only part of search. My rough guess is 1/3rd. I'm going for the other 2/3rds.

The computing is pretty simple and fast. Given enough documentation, the code was easy to write.

The crucial core of the work is some applied math I derived based on some advanced pure math prerequisites. I got most of the prerequisites in my applied math Ph.D. So, to me, the project is mostly applied math with some cute data manipulations and a simple Web site user interface.

Maybe for a computer science audience, my work should do well giving people content with the meaning they have in mind. I know; I know: Writing code for much of anything having to do with meaning has been a challenge. Well, I derived some math.

Tonight the work is not very exciting: I'm writing some Rexx code making crucial use of some cute Rexx functions to correct the time-date stamps on the directories written by Microsoft's XCOPY. The time-date stamps written are usually not those of the source directories and, instead, are just the time, date when XCOPY created the copy. The files XCOPY writes DO retain the time-date stamps of the source.

For the correction, my idea is to set the time-date stamps of each directory to the newest time-date stamp of the files/directories in that directory.

Important is how the Windows NTFS file system does time-date inheritance: If you create or change a file or create a directory, then the time-date stamp on the parent directory will within a few seconds be changed to the time-date stamp on the new or changed file or the new directory. If the new directory has a file created in it, then, sure, that directory will have its time-date stamp changed, BUT, surprising or not, the parent we mentioned will not have its time-date changed. I'm typing quickly; if this is unclear, complain and I'll be more clear.
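A small experiment can check this inheritance behavior on whatever file system is at hand. The following is only a sketch: the function name `stamp_bubbles_up`, the directory names, and the arbitrary old time-date stamp are all assumptions for illustration, not anything from the Rexx code described here.

```python
import os
import tempfile

def stamp_bubbles_up():
    """Create a file in parent/child and report which directory stamps changed.

    Returns (child_changed, parent_changed).
    """
    with tempfile.TemporaryDirectory() as tmp:
        parent = os.path.join(tmp, "parent")
        child = os.path.join(parent, "child")
        os.makedirs(child)

        # Arbitrary old stamp (2000-01-01 UTC) so that any update is visible.
        old = 946684800
        os.utime(child, (old, old))
        os.utime(parent, (old, old))

        # Creating a file inside 'child' should touch only 'child'.
        with open(os.path.join(child, "new.txt"), "w") as f:
            f.write("x")

        child_changed = os.stat(child).st_mtime != old
        parent_changed = os.stat(parent).st_mtime != old
        return child_changed, parent_changed

if __name__ == "__main__":
    print(stamp_bubbles_up())
```

If the inheritance works as described, the child directory's stamp changes and the parent's does not. The check reads the stamps immediately, so a file system that updates directory stamps lazily might need a short wait before the read.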

This information is important: If the time-date inheritance kept feeding up the directory tree to the root, then the corrections I want would have to work essentially from the leaves of the directory tree up and in a 'breadth first' way, with some tricky accumulation at each 'level' (distance from the root) of some maximum time-date values, to the root. But with the way the inheritance actually does work, I can correct the time-date stamps as I mentioned on the directories in any order.
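The correction described above, setting each directory's stamp to the newest stamp among the files and directories directly inside it, might look roughly like this in Python. This is a sketch, not the Rexx code in question: the function names are hypothetical, and the bottom-up walk is a conservative assumption so that a subdirectory is corrected before its parent reads that subdirectory's stamp.

```python
import os

def newest_child_mtime(directory):
    """Return the newest modification time among the immediate entries, or None if empty."""
    newest = None
    with os.scandir(directory) as entries:
        for entry in entries:
            mtime = entry.stat(follow_symlinks=False).st_mtime
            if newest is None or mtime > newest:
                newest = mtime
    return newest

def correct_directory_stamps(root):
    """Set each directory's modification time to that of its newest immediate entry."""
    # topdown=False visits subdirectories before the directories that contain them.
    for dirpath, _dirnames, _filenames in os.walk(root, topdown=False):
        newest = newest_child_mtime(dirpath)
        if newest is not None:
            # Keep the access time as it is; change only the modification time.
            atime = os.stat(dirpath).st_atime
            os.utime(dirpath, (atime, newest))
```

Empty directories are left alone, since they have no entry to take a stamp from. Setting a stamp on a subdirectory does not itself change the parent's stamp, so the values written stay as written.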

Not very exciting but has to be done.

anonytrary · on Jan 8, 2019

It sounds pretty clear to me, albeit your explanation is long winded. You're saying that, in the FS you're using, the time-stamp changes are "local" in the sense that they don't bubble up the tree after N = 1 levels.

In elevator-pitch terms, you're working on a search engine that takes into account user intent more accurately? Is that more or less correct?

graycat · on Jan 8, 2019

Thinking a little more, if the time-stamp changes did bubble up, then just start making changes at the leaves of the tree; at each leaf, work back to the root on the unique path to the root; at each directory, make a change in the time-stamp of that directory if and only if the new time-date stamp would be more recent than the old one; and, if no change is made at that directory, then we are DONE on that path. Revision: to make this work, first set all the directory time-date stamps to, say, year 1776. This solves the problem that the directories closer to the root may be on the path of several leaves and get changed several times.

To identify the leaves, that is, directories with no subdirectories, get the directory tree names and just sort them into ascending order. Then a leaf directory and all the directories on the path from that leaf back to the root will sort together, with the leaf directory the last one of those that sort together. That is, a directory is a leaf if and only if it is not on the path to the root (not an initial substring, using whole directory names) of the next directory in the sort. In particular, the last directory in the sort is necessarily a leaf. Ah, an algorithm!
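The leaf-finding step can be sketched in a few lines of Python. The function name `find_leaf_directories` and the forward-slash path format are assumptions for illustration. Each path is split into whole directory names before sorting, which is one way to honor the "using whole directory names" condition: sorting raw strings could place a sibling such as "a-b" between "a" and "a/b".

```python
def find_leaf_directories(directories):
    """Return the directories that have no subdirectories in the given list.

    Each path is split into its component names so that both the sort and
    the prefix test work on whole directory names, not on raw characters.
    """
    paths = sorted({tuple(d.strip("/").split("/")) for d in directories})
    leaves = []
    for current, following in zip(paths, paths[1:] + [None]):
        # 'current' lies on the path to the root of 'following' exactly when
        # its components are an initial segment of the components of 'following'.
        on_path = following is not None and following[:len(current)] == current
        if not on_path:
            leaves.append("/".join(current))
    return leaves

if __name__ == "__main__":
    print(find_leaf_directories(["a", "a/b", "a-b", "a/b/c", "d"]))
```

For the sample list, the leaves found are "a/b/c", "a-b", and "d". The last entry in the sort is always reported as a leaf, as noted above.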

I don't want now to further characterize or change the wording on what the startup is doing. My best word is the one I used, "meaning".

anonytrary · on Jan 8, 2019

How exactly will you know a user's "meaning" when they are asking for content? Do you intend to read their minds, somehow? Or perhaps you might hook them up to a heart rate monitor, track their eye movements and use ML to obtain user sentiment without requiring them to provide click-based feedback (like likes/dislikes).

graycat · on Jan 9, 2019

That doing well with meaning is a bit amazing as it is.

Saying "exactly" how is a bit much for a blog post. Besides the key is some deep pure and applied math with no way to explain those. Even if I gave an explanation, say, from my math derivations typed into D. Knuth's TeX math word whacking, nearly no one in the Sand Hill Road culture has the math prerequisites to understand it. Only a small fraction of those could do the original work I did. Even if they did do that work, no one on Sand Hill Road would have any interest at all.

I wasted MONTHS jerking the chains of the firms on Sand Hill Road, and all I got were laughs or silence. I explained as here the opportunity for the other 2/3rds of search but no one cared. I never got even to first base with Ycombinator.

One lesson is that Sand Hill Road just will not, Not, NOT do technical due diligence on original technology.

Right: They want a big market. My work stands to be of high interest to nearly everyone in the world with any access to the Internet, smartphone to high end work station. My Web pages will look just fine on nearly any smartphone. Big enough market?

World class research university pure/applied math Ph.D.? No interest. Long background in much of the best in computing at IBM's Watson lab? No interest. Running code ready for production? No interest.

Another lesson is that Sand Hill Road will be interested when I have a few servers busy, revenue significant and growing rapidly. Then they will offer me a term sheet where I go from owning 100% to owning 0% with a vesting schedule with some chance of getting back to maybe 40% ownership if I don't get fired except it is in the fiduciary interest and responsibility of the board member investor to FIRE me so that I don't get my stock vested.

The US DoD, NSF, NIH, NASA, DoE, etc. WILL do careful evaluations of technical material; Sand Hill Road just will NOT do that.

I also have to ask if I want to report to a BoD with people who have shown with their feet locked deep in reinforced concrete and their eyes and ears totally shut: They can't play a productive role in my work now, and I can't believe they would be able to play a productive role later as the technology improves.

One of the least pleasant ways to spend an hour is to listen to a math lecture when you don't understand anything said. I can see clearly: I give a presentation to the BoD about a small, new direction or initiative for the business, based on some applied math, complete with theorems and proofs, one of the best sources of credibility, and several of the BoD members get physically ill and rush to the restrooms. Then the BoD leaves, convenes in a local bar, has a dozen rounds of drinks, and votes me out of the CEO slot.

Those Sand Hill Road people just don't belong on the bridge of my ship; they are NOT qualified; they are a severe threat to the ship.

From some Mary Meeker (KPCB firm) data and some of my software timings, if I can get traffic enough to half fill my first server, then I'll have ballpark $250,000 a month in revenue with essentially all of that pre-tax earnings. At that time, no way will I accept a term sheet; not a chance. Instead I'll expand into three spare bedrooms, put in some good window A/C, get more electric power, install UPS boxes, get a backup generator in a hut out back, and grow to a few $million a month. Then I'll lease or buy some space enough to grow significantly more, hire, and plan for a major organization and server farm. Ah, it's just 2/3rds of search for nearly everyone in the world.

And no way will Sand Hill Road fund anyone to compete with me.

I've funded all of this from my own checkbook and am 100% owner. Some external funding would have helped, but now it's too late for that and for Sand Hill Road and Ycombinator.

It's really a math project; given the math, the rest is nearly all just a lot of routine typing although I did cook up a few maybe new computer science style algorithms, programmed them, and am using them in the code.

When I have alpha and beta tests, you will be able to see more although it will look like magic.

I can't keep people from calling it AI/ML, but I call it applied math.

anonytrary · on Jan 9, 2019

That's not an elevator pitch, you just used my question as an opportunity to fume about toxic Silicon Valley culture.

I can't deny, you gave me a few good laughs:

> Then they will offer me a term sheet where I go from owning 100% to owning 0% with a vesting schedule with some chance of getting back to maybe 40% ownership if I don't get fired except it is in the fiduciary interest and responsibility of the board member investor to FIRE me so that I don't get my stock vested.

> I give a presentation to the BoD about a small, new direction or initiative for the business, based on some applied math, complete with theorems and proofs, one of the best sources of credibility, and several of the BoD members get physically ill and rush to the restrooms. Then the BoD leaves, convenes in a local bar, has a dozen rounds of drinks, and votes me out of the CEO slot.

> Those Sand Hill Road people just don't belong on the bridge of my ship; they are NOT qualified; they are a severe threat to the ship.

I honestly can't tell if you're trolling or not. Are you yanking my chain?

graycat · on Jan 10, 2019

Not jerking your chain or trolling at all.

I know; it's possible to set up a corporation so that I have investors but control all the voting stock so that, then, really, the BoD can't fire me. But doing that would be a constant fight; I would be inviting into my company some potential enemies and would spend a huge fraction of my time, money, and energy fighting them. They would be there for themselves and trying to take from me. Easy solution: Don't invite them into the company. They don't want me now, and when they do I won't want them.

I'm going to stay a sole, solo founder.

Some people in business have been successful doing that. I know some, and one of them, whom I don't know, is sitting in the White House now. Indeed, from the time I spent in yacht clubs, nearly all the members did that -- sole, solo founders, a family company, private company, no VC/PE or other outside investors, no public corporation.

My point about how much people hate math lectures with theorems and proofs is exactly correct.

Then the scenario of the BoD rushing to the toilets and convening in a bar, getting drunk, and firing me is only a slight exaggeration for what would likely happen.

E.g., I'd be like that guy in Scion Capital in the book and movie The Big Short and Billy Beane in the book and movie Moneyball. They were both right and for the right reasons, but they nearly got shot down by skeptics. Indeed, the skeptics or their skepticism was the cause of much of the opportunity.

My views of Sand Hill Road versus DoD, NSF, NIH, NASA, DoE are literally, rock solidly true. DoD, etc. will invest in some applied math and mathematical physics, e.g., the US Navy's version of GPS (I worked in that group for a while), and Sand Hill Road just will not; they won't even consider original, powerful, valuable, advanced applied math. I have hundreds of e-mail messages proving that.

You seem to have wanted an "elevator pitch". Okay:

As has been clear going way back in information retrieval, key words do well in only about 1/3rd of the content on the Internet, searches people want to do, and results they want to find. I'm going for the other 2/3rds. Sufficient for this, I've derived some original applied math based on some advanced, pure math prerequisites, written the production quality code, and am rushing to go live. The math makes powerful progress on the challenging problem of the meaning of the content. The work stands to be of intense interest to nearly everyone in the world with access to the Internet via any device from a smartphone up to a high end workstation. First I will target users in the US and then Europe and then target the rest of the world as there are revenue opportunities.

That's just what I've done and am doing. Nothing could be more simple or true.

The main issue is the math, but the set of people in a position to understand the math and with confidence to see its power is really tiny. Even among the best trained Ph.D. mathematicians, the prerequisites are not well known.

From my Ph.D. work, but NOT from my time at IBM's Watson lab, I know some people with the prerequisites. When I did original work building on those prerequisites, those people liked my work right away. When I went to publish, my work was published right away. Now should I explain my original work, those people would think for a few days and then say something like "Nice. Right, that should work.". I was confident in my research then, and I'm equally confident now.

Much of the confidence comes from theorems and proofs -- I've always liked those because they have saved my skin many times since no math teacher or prof has ever been able to find anything wrong with my correct proofs, and after I had started to learn the material, my proofs were correct. Maybe the teachers/profs hated me; it was clear that some of them did; maybe they wanted to laugh at me, and too often did, like you do and like the people in those two movies did; but they can't find anything wrong with my proofs. If I were not confident, then I wouldn't be doing this.

I got into math and science partly because in high school the teachers had to give me A's in math and science. In English and history, the best I could do was get social promotion. In college I saw that math and science looked far and away like the best subjects for a good career. Early in my career, in applied math and computing on national security problems near DC, I saw more value in math and did a lot of independent study of both pure and applied math. Then I got a Ph.D. in applied math with a lot in pure math -- some of my publications can be regarded as both pure and applied.

For competition, just for the prerequisites, people would have to study and learn brilliant work in math going back 200+ years or reinvent that material. The studying is too hard for all but a tiny fraction of the population and reinventing is much harder, essentially impossible. The people Sand Hill Road likes to back haven't studied the material, won't, and certainly won't reinvent it. Then they'd be faced with my original work. Nope: I don't expect competition.

That's just the way the project is. Seems entirely reasonable to me. But you want to laugh at it. Hmm ....

anonytrary · on Jan 10, 2019

I wasn't really laughing at you, more so at your way of explaining those people you interacted with. You just had some funny prose -- which is hardly a bad thing.

So you'll scrape lots of web data, run it through your 'meaning algorithm' and then provide a search bar to search said data? So, like your idea is like Google, but it understands English sentences close to like a human would? Did I understand that correctly?

graycat · on Jan 10, 2019

> I wasn't really laughing at you, more so at your way of explaining those people you interacted with. You just had some funny prose -- which is hardly a bad thing.

Much of the prose was intended to be funny.

> So, like your idea is like Google, but it understands English sentences close to like a human would? Did I understand that correctly?

I'll just stay with the description I gave.

anonytrary · on Jan 10, 2019

You could work on a better description. Understanding a question without getting off topic, and describing things eloquently, concisely, and in layman's terms, are necessary skills.

graycat · on Jan 11, 2019

I gave short, concise, layman's terms, user's terms, on topic descriptions lots of places in this thread: Going for the rest of search, the 2/3rds currently served at best poorly, and based on the meaning of the content. I said all that.

I'm not pitching VCs here: As I made clear, I wasted months on VCs and gave up. The short, simple, ..., description of why is that (1) they won't evaluate and, thus, won't invest in technology and (2) they want to invest in traction significant and growing rapidly and by the time I have that I won't accept an equity check or a BoD. As I wrote, the VCs don't want me now, and when they want me I won't want, need, or accept them. Nice and short.

Want a shorter elevator pitch? Google and Bing are good in about 1/3rd of search. I'm going for the other 2/3rds.

Elevator pitches are for investors, and I spent months sending pitches just this short, with more below, longer, varied, etc. The conclusion was simple: The first requirement for any check is traction. Many VC Web sites were deceptive on this point; as a determined entrepreneur I kept trying; but I've learned my lesson -- no role for VCs.

graycat · on Jan 10, 2019

I was very much "on topic" for my startup.

It was not clear that you were actually asking a question or wanting an "elevator pitch".

For my "abilities", your statement is patronizing and an unjustified insult. My abilities are just fine, thank you.

Your guess at what my startup is doing is not good. What I am doing is much better than anything that could be or follow from your guess.

Again, once again, over again, yet again, one more time, the crucial core is some original applied math based on some advanced pure math prerequisites. It really is. That's the truth. The work is from some of my studies in my Ph.D. in pure/applied math and more studies and then my original work. This is just literally true, not hype. It's TRUE. That few startups are doing such things is not my problem but some of my opportunity. But one result is that the only people with even a shot at evaluating or understanding the core technology are well trained mathematicians. That's just true, and as such I have no better way to put it.

That the startup also uses some computing does not mean that Sand Hill Road or academic computer science is qualified to evaluate or understand the technology. More generally, Sand Hill Road and academic computer science just do not cover, even significantly cover, all the technology that can use computing and be valuable for a startup. In particular, there is applied math, and that is much more broad and deep than Sand Hill Road or academic computer science; such people just have no chance of understanding the work because they don't have the prerequisites and just will not reinvent them. I've been clear on this.

With some of the other news on HN now, I should insert, as I have in my more polished descriptions, that my search engine is to be "safe for work". It should also be safe for kids and be family-friendly.

Yes, I'm going for the other 2/3rds of search, but the whole of search I'm considering has no porn, etc.

Finding content based on meaning should be one of the best contributions to culture and civilization, and I hope the site does that.

And the site stands to have some of the best protections for user privacy on the Internet: E.g., the engine does not set, read, write, or use HTTP cookies. User Internet history is not collected or used: Two users who use the site the same way at essentially the same time will get the same results.

The site has nothing to do with natural language processing, the semantic Web, the interest graph, neural nets, etc.

In no significant way is the technology accessible to the Sand Hill Road community, no more than to some unknown tribe deep in the Amazon; they have the same on the prerequisites -- nothing.

anonytrary · on Jan 14, 2019

> Again, once again, over again, yet again, one more time, the crucial core is some original applied math based on some advanced pure math prerequisites.

This is super descriptive! I should have understood you the first time. My apologies.