Here is a list of open problems. Come up with an original solution for any of these, and get an automatic 100 for the course!

Note that *some* problems listed here include an update
indicating that they are **no longer open**.
Naturally, solving any of *those* problems will not suffice.

Let P_{2} = { *x* ∈ {0,1}^{*} : *x* is primitive }.
It is not hard, and makes a good exercise, to prove that language
P_{2} is not regular. However, the following problem is open:
**Is P _{2} context-free?**

Most people believe the answer is no, but no one has a proof. This
very hard problem
has been open for about 30 years. (There's even a 500-page book, published
in 2014,
about this problem,
*Context-free languages and primitive words*, by Dömösi
and Ito.)
By contrast, it is possible to prove that the language of non-squares
*is* context-free. This also makes a good exercise!

**Can Q
be expressed in terms of known constants? Is it irrational?
Transcendental?**

(a) Does every squarefree finite word appear as a subword (factor) of
**x**?

(b) Do the letters occur with limiting frequencies in **x**, and if so,
what are the frequencies? (Limiting frequencies are defined below in problem 10.)

*Update, March 22 2012*: solved! by Slovak computer scientist
Galina Jirásková. The bound is
2^{Θ(n log n)}.

(It is known that it is possible to avoid *three* consecutive
blocks of the same length and same sum.
See this paper.)

Prove or disprove that the limiting frequency of the symbol 1 in
*K* exists and is equal to 1/2. The definition of limiting frequency is
the limit
lim_{n → ∞} | *K*[1..n] |_{1}/*n*.

Is there a significantly faster algorithm? For example, can we
find all the strings accepted by an
*n*-state unary acyclic NFA in *O*(*n*^{2}) time?

*Solution!* (sort of) by Aaron Potechin, February 2017:
he shows that the problem
of determining whether a graph on *n* vertices has a triangle
reduces to this problem. Since no algorithm faster than
*n*^{e} is known for the triangle problem, and it is
well-studied, any solution for the NFA problem would improve the
running time for a very well-studied problem.

*Tentatively solved!* Curtis Bright has found that for
i = 8005, the number is a "probable prime". That is to say, it
passed about 12 Miller-Rabin tests. It remains to complete a formal
proof for this exponent.

There is an old manuscript of Sándor Horváth that claims to solve this problem, but to my knowledge it has never been published, and there seem to be gaps in the proof.

New! There is another claimed solution by Pálvölgyi Dömötör, posted here. I have not been able to understand it, and I am not completely convinced it is correct. If you read the proof and find an error, please let me know (you will get some extra marks). If you read the proof and are convinced by it, write it down in your own words, again for extra marks.

In our example, we get 33232, then 332322, then 3323222, then 33232223, then 332322231.

Conjecture: this process always terminates with 1. For more information, see the paper here.

Is there a positive integer *n* ≥ 2 and nonempty words
*u*_{1},
*u*_{2}, ...,
*u*_{n}
such that both of the following equalities hold:

(*u*_{1} *u*_{2} ... *u*_{n})^{2} = *u*_{1}^{2}
*u*_{2}^{2} ...
*u*_{n}^{2}

(*u*_{1} *u*_{2} ... *u*_{n})^{3} = *u*_{1}^{3}
*u*_{2}^{3} ...
*u*_{n}^{3}

and furthermore for at least one pair of indices
*i, j* we have *u*_{i} *u*_{j} ≠
*u*_{j} *u*_{i}?

*Update*, April 2017: The problem has been
solved (negatively) by Aleksi Saarela.

If the "does there exist a pair" is replaced by "is it true that for all pairs", then the answer is known to be decidable.

Is π or e or log 2 or γ (Euler's constant)
(*b*, *k*)-automatic for some (*b*, *k*)? Probably
not, but nobody currently knows how to prove it.

There is a more general question here, which I mentioned in class: suppose you apply a finite-state transducer to **t**
and you get something, say **x**, that is not ultimately periodic.
Can you always apply a finite-state transducer to **x** to get **t**
back again? Nobody knows.

What is its complexity? I can prove it is PSPACE-hard, but is it in PSPACE? I offer $100 for this problem.

*Update, 2017*: Solved! By
Marek Szykula and Pawel Gawrychowski.

Same question for bases 9 and 10. For base 10 this is sequence A130448 in the Online Encyclopedia of Integer Sequences.

*Update:* Partially solved in Fall 2018 by Joakim Blikstad; he
proved that this LCS has length 2^{n}(1-o(1)).