First limit: a human can produce only a limited number of speech sounds, which are described in terms of articulatory distinctive features:
The number of articulatory feature clusters (distinct speech sounds) used in a typical spoken language is somewhere in the twenties, though some languages use fewer and some considerably more.
These phonological features may be represented visually using lines, or even cartoonish depictions drawn from the terrestrial environment (animals, weather, house, man, woman), to aid in learning sound-to-stroke associations. The strokes could be organized into visual depictions of the environment, into sound clusters, or into both.
If the visual features are intended to represent spoken words, there are sharp cognitive limits on the number of spoken clusters (words) that can be encoded while still allowing continuous reading aloud. One has only enough short-term memory to handle about 5-7 words at a time.
For these spoken representations to be comprehensible to others, the listener (or the reader) must apply the logical and grammatical rules that define the language, which are conveyed through conventions of structure and organization. These conventions are taught by the culture.
All of these considerations underlie the creation of written language--and alphabets.