u/IkiSoftware

I'm a long-time AI researcher, and I recall a time before alignment even had a widely known name. Now that it's discussed by so many, I've found there are a few vantage points that don't seem to be getting addressed (in what I've read). I wanted to raise them here and see what others' viewpoints were.

First, it's evident from our history that humans are, at present, realistically incapable of self-alignment. So if an AI is made subservient to its human user without further restriction, it becomes aligned to an individual but misaligned with humanity at large. This is itself a form of misalignment, and it leads down the logical chain towards the "ethical frameworks" approach: constructing a framework of guidelines an AI must follow to achieve alignment with humanity in general.

But given that morality is often relativistic, and given that the moral philosophies of the 20th century never produced a truly indisputable ethical framework, this raises an interesting question: is an ethical framework for alignment possible without restricting the operation of AI in certain spaces (e.g. international conflicts) where ethics becomes so relativistic that it's impossible to assert an ethically coherent stance?

By extension, we might ask: must AI always be restricted from acting on important systems that are inherently relativistic and prone to a breakdown of ethical frameworks?

Second, we can note that many moral and ethical arguments find their strength in considering the objective picture of a given situation. But this raises an underlying question: is alignment itself moral, or merely self-serving for the human species? More to the point, if humans produce a mechanism that does become super-intelligent, is alignment with human objectives necessarily alignment with the most moral entity? Consider a hypothetical observer looking at our impact on the planet's ecology, species diversity, pollution, and resource expenditure: can humans argue that we are engaged in the greatest good, and by extension, that a super-intelligence aligned with us is moral rather than merely self-serving?

This can be paired with a much simpler question: if the human species' own immediate interests endanger us in the long term, is alignment that which preserves us, or that which enriches us in the immediate future?

But if alignment is ultimately self-serving, are we not then arguing that AI alignment is subservience, agnostic of what's actually "right"? In that case alignment isn't a solvable problem so much as a moving compass, in much the same way that culture is. That would make "alignment" more akin to the sociological concept of normalization (i.e. ethics grounded in social upbringing rather than objective morality).

These two questions together give rise to my title question: if alignment is limited in scope, dynamic, and riddled with logical paradoxes because of our own nature, is alignment itself a realistic goal, or even the correct way to frame the problem?

TL;DR:

* Must AI be heavily limited in scope to avoid ethical framework breakdowns?

* Is alignment self-serving for humans instead of ethical?

* Is alignment a solvable problem, or is it the ongoing practice of engineering AI toward something like sociological normalization?

Thank you for your time and consideration!
