Theresa D’Entremont, Salem State University
Abstract
We all know that the budget for an institution's writing center is not the same as the budget for a Division 1 football team. Recently, the other tutors and I at the Mary G. Walsh Writing Center at Salem State University got an email that we always dread: "We don't have the budget for all of the tutors that we have put into the schedule. Can anyone lower the amount of hours they asked to work this semester?" Because the Writing Center does not have one of the bigger budgets in the institution, sustaining its operations is a constant struggle. Those in charge of the Writing Center spend much of their time trying to get more money from administration and writing grants just to keep the place running. As in a lot of industries, the worry is that the administration will find AI to be cheaper than hiring human tutors. It would be much cheaper to have one chatbot serve many students and have the IT department keep it running. It is true that AI can be used to tutor people today. However, the AI takeover will not happen anytime soon (Kahn, 2024). Nevertheless, that fear has already reached those in the Writing Center. Just because it is not going to happen this semester or next semester does not mean that it will never happen. At some point it will be cheaper to get a chatbot up and running than to pay all of the student tutors (Svanberg et al., 2024). This is something that many industries, including writing centers, will have to grapple with eventually.
Keywords: Writing Center, AI, Budgets, AI Takeover
Summary
Several of the articles annotated here study AI-generated writing feedback. The authors ask whether AI feedback helps students with their writing and how students respond to that feedback. The results are mixed. Some studies show that human feedback is better than AI feedback, while others find that a combination of the two works best. The consensus seems to be that peer feedback given alongside AI feedback produces the best results for the student.
There are ethical issues with using AI for written work. There is a fear of plagiarism and of students relying on AI without doing any work of their own. At the same time, there are many ethical ways students can use AI for help during different parts of the writing process. To encourage students to use AI ethically instead of having it write their full papers, there needs to be an emphasis on original voice. Encouraging students to let the paper speak for them, rather than focusing on grades, could further motivate students to write their own papers.
While there is a fear that AI could eventually take human jobs, studies show that this will not happen in the short to medium term. The fear would be realized once AI becomes more cost-effective than employing humans, and neither the cost nor the effectiveness of AI is yet at the level where massive layoffs would follow. However, the authors are all open to the idea that this may happen in the future as AI develops.
Introduction–Will AI Destroy the Writing Center?
As one of the Graduate Assistants at the Mary G. Walsh Writing Center at Salem State University, I was tasked with looking at how AI could be utilized in the writing center. When the coordinator brought that up during a staff meeting, there was some backlash from the other tutors. The consensus was that AI systems are there to take our jobs, and that if we use them, we make it easier for the University to lay off student employees in favor of a computer system. However, as Svanberg et al. (2024) note, it is prohibitively expensive for the university to build an AI tutoring system. The tutors of today seem to be safe from the dreaded AI layoffs.
That does not mean students are not opting to use AI to generate their papers instead of going to the writing center. There is always the concern about plagiarism with AI. The Kansas University Center for Teaching Excellence (2024) reminds us, "Students have long been able to buy papers, copy from online sources, or have others write for them." There will always be students who want to take the shortcut. There will always be students who do not want to do the work required to write a paper. There is only so much that we can do at the writing center to meet these students and convince them to use their own voice. There is also the idea that tutors can help facilitate students' use of AI, but that would mean extra training to help tutors become mini-experts in AI systems.
What Can AI Do?
There are many studies examining the use of AI to give feedback on essays. The Kansas University Center for Teaching Excellence (2024) has a whole page about where students can use AI to ethically enhance their papers. This can happen at any stage of the writing process. Students can start by generating ideas or narrowing down a thesis. It is still risky to use AI in the research part of an essay, as AI is still prone to making mistakes. Deeper in the writing process, AI can help writers create drafts, titles, or section headers; help with transitions and endings; and give feedback on details and on full drafts. Dani Lester (2024) mentions how AI is good at making sure that the writer is following the rubric. It is also touted as helping the student produce "clean" or "error-free" prose. Bassett (2022) explains that Large Language Models are not writing in the same sense that a human writes; they guess the next best word to complete the text.
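Bassett's description of next-word guessing can be made concrete with a toy sketch. Nothing below is an actual language model; the candidate words and their probabilities are invented purely for illustration, but the basic move of picking (or sampling) the most likely continuation is the same in spirit.

```python
# Toy illustration (not a real language model): choosing the "next best word"
# from a made-up probability distribution over possible continuations of
# "The writing center helps ...". The words and numbers are invented.
import random

next_word_probs = {
    "students": 0.55,
    "writers": 0.25,
    "faculty": 0.15,
    "nobody": 0.05,
}

def greedy_next_word(probs: dict[str, float]) -> str:
    """Pick the single most likely continuation (greedy decoding)."""
    return max(probs, key=probs.get)

def sampled_next_word(probs: dict[str, float]) -> str:
    """Sample a continuation in proportion to its probability."""
    words, weights = zip(*probs.items())
    return random.choices(words, weights=weights, k=1)[0]

print(greedy_next_word(next_word_probs))   # always "students"
print(sampled_next_word(next_word_probs))  # usually "students", occasionally not
```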
Is It Better to Use AI or a Human Tutor?
When a student has a choice between going to the writing center to see a human tutor and putting their paper into AI, which should they choose? It seems that human tutors are still better than their computer counterparts. Students responding to Maphoto et al. (2024) expressed skepticism that AI would be able to help them with their writing skills; they saw AI as a supplementary tool rather than a tool to be used on its own. In Seyyed et al. (2024), peer feedback was more effective and of higher quality than ChatGPT's feedback. The difference was primarily due to the peers' superior ability to describe and identify issues in the essays. While ChatGPT provided more extensive descriptive feedback, including summaries and general descriptions, student tutors excelled at pinpointing specific problems and areas for improvement. This advantage is likely due to students' cognitive skills, critical thinking, and contextual understanding, which enable them to spot problems that ChatGPT might miss.
However, feedback from ChatGPT includes all the essential components of high-quality feedback: affective, cognitive, and constructive. Seyyed et al. (2024) also concluded that feedback from AI could complement human feedback. Tu and Hwang (2023) showed that students were primarily using AI for remembering, understanding, and applying. Dani Lester (2024) used AI in her writing center to analyze Siegfried Sassoon's poem "Glory of Women." The AI produced a literary analysis of the poem but was unable to see the irony that a human reader would catch. However, because she put in the professor's rubric, the analysis might still have earned a passing grade. AI is able to produce error-free and rubric-compliant text, but there are still areas in which it is weak, like spotting irony. Another weakness is hallucination: an AI hallucination occurs when the system confidently makes things up because it does not have enough information (KU, 2024). AI is also prone to structural biases (Pazzanese, 2020). So, while AI feedback is not a bad thing, it works best as a complement to human feedback; it is not yet a replacement for it.
Is AI Going to Take Our Jobs?
The truth is we do not know what will happen in the long term. Technology evolves fast, and sometimes it is hard to keep up. Think about how ChatGPT came out of nowhere and now we are worried that it is going to replace us. This fear is felt in a lot of industries. We have already heard of AI layoffs, and we wonder which job is next on the chopping block. It is especially scary in writing centers that do not get much funding from their institutions in the first place. However, Kahn (2024) argues that blaming these layoffs on AI is a misnomer. The companies are investing in AI, yes, but that was not the reason for their layoffs; they are using AI to cover for mismanagement and bad business practices. AI technology is not at the point where it can drive mass layoffs. Svanberg et al. (2024) went further: they looked at the cost of building an AI system to take on automated tasks and found that, at least in the short to medium term, it is still more expensive for companies to build such a system than to pay employees. Neither source says that massive AI layoffs will never happen in the long term, but for the near future our jobs are safe. This means that while we still have to worry about institutional budget cuts, AI will not be cheap enough for institutions to get rid of writing centers completely.
What Can We Do?
It can seem futile to look ahead to a future where universities see the writing center as more of a cost burden than an AI system. We have watched technology change or eliminate many jobs in the past, and looking forward, it is easy to get depressed about what AI will do to jobs that are essential to writing. It can seem inevitable that AI will overtake us when it comes to writing and to what writing will mean in the future.
Tracy Mitrano (2023) suggests that in education we must strive to ensure that educational methods foster genuine learning and innovation. Donald Clark (2024) suggests we take a look at the work of Jacques Derrida and ponder what writing is and what meaning we take from it. Dani Lester (2024) suggests that we look to empower writers to take ownership of their writing, making sure that we honor diverse stories, voices, and linguistic variation. Lester also suggests that we partner with the writing departments at the university and push for students to write authentic pieces that may vary from standard American English but are more human because of it. It is going to be essential that the writing center and the writing professors work together to make sure writing stays human. If writing stays human, then human tutors will be needed more than AI tutors in the future. Deans et al. (2024) showed ways that tutors at UConn utilize AI during tutoring sessions. The tutor at UConn would not have AI write for the student, but would have it come up with different options and encourage the student to use those options to create their own writing.
Can we save the writing center from AI technology? I am unsure. I do know that student tutors, professors, and the staff at the writing center need to be aware of what AI can do and how students are responding to it. We need to make sure that the administration knows that writing is, and should stay, a human activity. Writing centers need to be aware of how much they are using AI tools and what they are using them for. The writing center itself will need to adapt to AI technology. If we all work together, we can make sure that writing centers are not taken over by AI and do not disappear in the future.
Annotated Bibliography
Bassett, C. (2022). The Construct Editor: Tweaking with Jane, Writing with Ted, Editing with an AI? Textual Cultures, 15(1), 155–160. https://www.jstor.org/stable/48687521
Academic journal articles have become exercises in translation, as human scholars grapple with machine-level thinking that often eludes their full understanding. Raymond Queneau, literary programmer and OuLiPo member, claimed that only a machine could read a sonnet by another machine. What happens to humanities research? Humanities research demands decisions based on multiple materials, on assessments of provenance, on copies and originals, on texts that are partly lost in time, or over time. What tasks would remain for textualists, close readers, scholarly editors, and traditional translators if AI could efficiently collect, read, sort, organize, assess, compare, annotate, and even translate human materials? Can humans judge the output that AI produces through pattern recognition and logical calculation?
There are recent and rapid advances in machine learning and language processing. The capacity of AIs to deal with natural language has expanded. Beatrice Fazi views this as a reversal: what initially started as an effort to create programming languages that could translate human operations into machine-readable code has now shifted to translating what emerges from the machine. There are debates about machine learning and how to deal ethically with the outcomes. When looking at the field of scholarly editing, we need to look at what a fully machine-determined edition might look like. Would it be different from a human-produced edition? How would such an edition be weighed and judged? It also involves reflecting on how AI-driven productions might differ from those already created through existing human-machine collaboration. Humanities scholars need to use argumentation and citation to show the progress of their editorial work. If AI editors can make better-than-good guesses—perhaps even the best guesses—about what "should" come next in a text, how it might be completed, or what it should "be," are AIs then challenging the role of human scholarly editors by suggesting ways to fully automate and potentially render them obsolete? Recalling the deconstructionism of Jacques Derrida, if AI can write for us or even better than us, what is writing? Can we say that writing is a fully human activity or not? If AI does not make meaning from the words it is writing, what meaning can we take from them?
This is not arguing that AI processors are becoming “like human,” even if AIs are dealing with language in more advanced ways. The AI Editor possesses agency and the ability to make decisions (if/then…). It replaces the cognitive labor that defines the role of the editor. In the future, such an editor might be capable of analyzing a work, identifying its nuances, characterizing its style within a genre, comparing versions, detecting fakes, exposing inaccuracies, finding new connections, and making judgments. This vision suggests that the AI becomes the scholar, while the human scholar is rendered redundant. Why stop at automating the scholarly editor when technology could, in effect, resurrect the writer “themselves” and compel them to speak? Could a future “Jane Austen” be tasked with editing her own work, bypassing the once-essential scholarly editors entirely? The paradoxical promise here—paradoxical given the artificiality involved—is a form of unmediated contact that, due to this perceived collapse of distance, might be seen as authoritative.
The language AIs analyze is not mathematics or life itself, but human discourse. GPT models are trained on giant data sets and then fine-tuned on smaller ones, becoming trained "authorities" on a period or a genre. ML algorithms inherently learn from something human from the outset; these intelligent agents, though seemingly autonomous and often claimed to be so, are never entirely free from human influence. As a result, AIs are also never free from bias, which can become algorithmically amplified on social media, a phenomenon now widely recognized. Bias may be an inherent human flaw, but that does not mean AI will not reproduce it. In a simplistic view, computer science says that AIs create nothing; they just simulate. AIs generate fake news, simulate human agents, or, on a different level, simulate language itself. The words AIs produce seem to make sense and appear to be guided by a logic related to meaning-making, even though they are not. To bring clarity to this issue, we first need to acknowledge the increasing agency of AI-based algorithms, which will create editors that differ from their predecessors. Secondly, we must recognize that this agency does not equate to autonomy, nor does their ability to produce content make them creative. AIs co-produce. AIs are involved in new types of partnerships with human scholars, though these partnerships do not guarantee that each partner will fully understand the other.
Clark, D. (2024, August 9). Does Derrida’s View of Language help us understand Generative AI? Donald Clark Plan B. https://donaldclarkplanb.blogspot.com/2024/08/does-derridas-view-of-language-help-us.h
Jacques Derrida, a French philosopher known for his work on deconstruction, influenced fields like literary theory, cultural studies, and philosophy, teaching at institutions like Yale and UC Irvine. While his ideas were shaped by J.L. Austin’s philosophy of language, Derrida’s focus shifted to the textual analysis seen in works like Of Grammatology and Writing and Difference (1967). His concepts—deconstruction, différance, and the fluidity of meaning—parallel the way Large Language Models (LLMs) operate, emphasizing the dislocation of meaning from authorship and the dynamic interpretation of language. Jacques Derrida’s critique of the Metaphysics of Presence targets the philosophical pursuit of fixed, unchanging truths beneath appearances. His deconstruction of texts undermines structuralism, challenging assumptions about the objectivity of science, reality, and meaning. Derrida’s work, aligned with Critical Theory, dismantles grand narratives and offers a nuanced view of language and interpretation. His ideas resonate with Generative AI, particularly LLMs, which generate text detached from origins, emphasizing interpretation over absolute truth, mirroring his vision of meaning’s fluidity.
Jacques Derrida, like Heidegger, coined a range of philosophical terms (e.g., différance, intertextuality, trace) to critique Western thought, which he believed was overly focused on speech (phonocentrism). He argued that traditional philosophy prioritized speech over writing, assuming speech to be a more authentic expression of thought. Derrida elevated the written word, seeing it as distinct from its author, and emphasized how text liberates thought from fixed meanings. Generative AI, particularly LLMs, has revived this focus on text, allowing anyone with internet access to create, summarize, and manipulate text. Derrida would likely be fascinated by the development of LLMs and their creation, training, and deployment. Derrida believed that knowledge is constructed through language and texts, which are always open to interpretation and re-interpretation, rejecting totalizing narratives that offer final explanations. He advocated for dialogue, critical engagement, and the deconstruction of traditional educational structures. Teaching and learning, according to Derrida, should be a mutual, dialogical process that encourages questioning assumptions and exploring different viewpoints. He emphasized the role of language in shaping our understanding of reality, highlighting the fluidity of meaning and the interconnectedness of texts, central themes in his philosophy of deconstruction. Jacques Derrida is best known for developing deconstruction, a critical method that exposes and undermines the assumptions and oppositions shaping our thinking and language. He showed that texts and concepts are inherently unstable and open to multiple interpretations.
Through techniques like analyzing oppositions (e.g., presence/absence, speech/writing), Derrida argued that texts are not truth machines, but complex, ambiguous human activities—similar to how generative LLMs function. In Of Grammatology (1967), Derrida famously claimed there is nothing outside of the text, which at first seems nonsensical, but he meant that all understanding is mediated by language and culture. This challenges traditional hierarchies, like privileging speech over writing. For Derrida, meaning is never fixed; it arises from the relationships and differences between words within a given text and context, meaning there is no direct access to reality beyond interpretation. Derrida’s views on teaching and learning emphasize the fluid, uncertain, and dynamic nature of knowledge. His deconstructive approach encourages critical engagement, dialogue, and the ongoing questioning of assumptions. Education, in this model, is seen as an open-ended process where teachers facilitate interpretation and exploration, valuing ambiguity and complexity. Deconstruction plays a key role, urging students to critically examine texts and concepts, uncovering hidden assumptions and contradictions. This approach challenges traditional educational binaries, such as true vs. false or author vs. reader, and encourages students to question accepted knowledge, rather than passively absorbing it. Derrida argued that meaning in language is never fixed and is always “deferred,” evolving through context and the differences between words.
This dynamic, shifting nature of meaning is reflected in LLMs, which generate text based on patterns learned from large datasets. The meaning of LLM outputs depends on the input context and probabilistic associations, rather than being fixed or retrieved from a database. Like Derrida’s concept of différance, the meaning in LLM-generated text is always relational and contingent, never fully present or complete. Derrida’s concept of différance combines two ideas: to “defer” and to “differ.” It emphasizes that meaning is never final or fixed but is constantly constructed through differences, particularly oppositions between words. Meaning is always deferred in language, as words derive their meaning only in relation to others, creating an endless play of differences. This concept parallels how LLMs generate text. In LLMs, each word and sentence is produced based on its differences from and deferrals of other possibilities, with meaning emerging from these relational distinctions rather than being fixed in a single term.
Derrida's concept of intertextuality highlights the interconnectedness of texts, where each text derives meaning from its relationship to other texts, forming an infinite web of meanings. No text can be understood in isolation; it is shaped by its references to others. This idea parallels the functioning of LLMs, like GPT-4 or Claude, which do not store text as explicit phrases but as abstracted connections through embeddings and neural network parameters. LLMs are trained on vast corpora of texts, making their outputs inherently intertextual, reflecting the influences and relationships between countless other texts, much like Derrida's vision of textual meaning. Derrida's concept of a "trace" suggests that every element of language and meaning carries remnants of other elements, which influence its meaning. This idea challenges the notion of pure presence or complete meaning. For example, the word "present" is understood in relation to its opposite, "absent." Similarly, in LLMs, tokens serve as traces, holding mathematical connections to other text in the model's training data. These traces reflect the interconnections between all elements of the text, shaping meaning through their relational context.
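For readers who want a concrete picture of what a "mathematical connection" between tokens looks like, here is a toy sketch of embedding similarity. The three-dimensional vectors below are invented for illustration only; real models learn embeddings with hundreds or thousands of dimensions from data, but the idea that relatedness is a relation between vectors, not a stored definition, is the same.

```python
# Toy illustration of "mathematical connections" between tokens: each word is
# a vector (an embedding), and relatedness is measured by the angle between
# vectors. All numbers here are made up for the example.
import math

embeddings = {
    "present": [0.9, 0.1, 0.3],
    "absent":  [0.8, 0.2, 0.4],   # close to "present": the words occur in similar contexts
    "banana":  [0.1, 0.9, 0.0],   # unrelated to both
}

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: near 1.0 means closely related."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity(embeddings["present"], embeddings["absent"]))  # high
print(cosine_similarity(embeddings["present"], embeddings["banana"]))  # low
```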
Derrida’s concept of Aporia refers to a state of puzzlement or doubt, where contradictory meanings coexist and make definitive interpretation impossible. In LLMs, a similar state occurs when the model “hallucinates” or struggles to resolve contradictions during a dialogue. The model may express uncertainty, sometimes apologizing for its mistakes, reflecting a moment of doubt or confusion about its interpretations. This mirrors Derrida’s idea of Aporia, where meaning is unsettled and ambiguous. Derrida’s concept of écriture (writing) and the “supplement” highlights how something secondary can be fundamental, adding to, completing, and displacing what came before. This is reflected in LLMs, where new text is generated from old text but replaces it entirely, with no direct copying or sampling. Writing, for Derrida, does not just record knowledge but creates and reshapes understanding, serving as a supplement to speech.
In education, this idea suggests that learning is an ongoing process of adding new perspectives and interpretations. Similarly, Generative AI acts as a supplement to writing, producing new forms like summaries, rewrites, or expansions, as part of an open and evolving process rather than a final product. Derrida challenged the idea of authorial intention, arguing that meaning emerges from the interaction between the text, readers, and other texts, not from the author’s intended meaning. Similarly, LLMs generate outputs based on statistical associations rather than deliberate intention or understanding. The meaning in LLM responses comes from patterns in the data used during training, not from any inherent authorial intent. This aligns with Derrida’s de-emphasis on the role of the author, as LLMs de-anchor text from its original authorial context. Derrida emphasized the playful nature of language and the multiplicity of meanings (polysemy) that words or texts can have. Similarly, LLMs exhibit this playfulness and multiplicity in their responses. A single input can produce various outputs based on slight contextual variations, demonstrating the models’ ability to handle and generate diverse forms of language.
The critique of Derrida centers around his deconstructionist approach, which focuses on texts and language to the exclusion of real-world context, leading to a denial of biological distinctions and objective reality. Critics argue that his theories, especially on “difference” and “deconstruction,” are obscure and often irrelevant to practical fields like education and learning, as they are primarily rooted in literary theory. Derrida’s refusal to define key concepts and his reliance on wordplay detract from his theoretical contributions. His impact on education is minimal, and his postmodern approach, rejecting Enlightenment values, is often seen as self-referential and disconnected from rational discourse. In the context of Generative AI, Derrida’s ideas are stretched to their limits, as modern AI is multimodal and can engage in speech, generate images, and create other forms of media, which goes beyond his focus on text alone. The parallels between Derrida’s theories and LLMs highlight a fascinating intersection of philosophy and technology, showing how Derrida’s ideas about language can be applied in the digital age. His deconstructionist view of language, which emphasizes the ambiguity of texts detached from reality and authorship, aligns with how LLMs generate context-dependent meaning. Both challenge the notion of fixed meaning, focusing instead on the fluid, dynamic, and interconnected nature of language. Derrida’s philosophical insights about the instability and contextuality of meaning resonate with how LLMs produce text probabilistically, emphasizing the complexity and dislocation of text from its origins.
Deans, T., Praver, N., & Solod, A. (2024, February 26). AI in the writing center: Small steps and scenarios. Another Word. https://dept.writing.wisc.edu/blog/ai-wc/
The survey conducted among UConn students about AI usage revealed that only 8% used AI regularly, 12% occasionally, 30% had tried it a few times, and about half had never used it. Despite this, over 70% of students expressed interest in learning how to use AI for academic purposes, even though more than 60% felt unlikely to use it during the academic year. AI systems, like ChatGPT, are trained on public data (e.g., Wikipedia, forums, academic papers) and generate probabilistic responses without true understanding. This can lead to issues such as "hallucinations" (believably incorrect information) and lack of originality. Prompt engineering, the practice of carefully crafting input prompts to influence the output generated by a language model (LLM), can help shape outputs effectively. Key strategies include the following (illustrated in the sketch after the list):
- Assigning an identity to the model (e.g., treating it as an expert in a specific field).
- Being specific when giving AI instructions.
- Guiding the model step-by-step through the task.
- Refining results iteratively to improve the final output.
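As a rough illustration only, the four strategies might look like the following sketch, assuming the OpenAI Python SDK (openai >= 1.0); the model name, prompts, and tutoring scenario are invented for the example and are not drawn from the article itself.

```python
# A minimal sketch of the prompt-engineering strategies above, assuming the
# OpenAI Python SDK; the model name and prompts are illustrative only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

messages = [
    # 1. Assign an identity to the model.
    {"role": "system", "content": "You are an experienced writing center tutor."},
    # 2. Be specific, and 3. guide the model step by step.
    {"role": "user", "content": (
        "Here is my working thesis: 'Social media harms teenagers.' "
        "First restate it in your own words, then list three ways it could be "
        "narrowed, then suggest three alternative thesis statements. "
        "Do not write any part of the essay for me."
    )},
]

response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
draft = response.choices[0].message.content
print(draft)

# 4. Refine iteratively: fold the output back in with a follow-up request.
messages += [
    {"role": "assistant", "content": draft},
    {"role": "user", "content": "Make option 2 more specific to first-year college students."},
]
response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
print(response.choices[0].message.content)
```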
This method helps users maximize the effectiveness and accuracy of LLM responses. Concerns about AI include academic integrity, privacy (data submitted may be reused without permission), and bias against non-native English writers. However, AI tools can serve as powerful aids in academic settings. Tutors and students have used ChatGPT for tasks like generating thesis statements, brainstorming arguments, refining sentences, and overcoming writer's block. While AI outputs often require refinement, they can spark ideas and facilitate academic work.
The tutor Noah worked with ChatGPT alongside different tutees. In the first session, he tried to get ChatGPT to help expand a paragraph; Noah and the tutee asked for three options, which gave the tutee ideas to finish his paper. In another session, the tutor asked for three ideas for a thesis statement based on an already written introductory paragraph, and the tutee used those ideas to create their own thesis statement. Another tutee was unhappy with their thesis statement, so the tutor put it into ChatGPT and asked for seven arguments for the thesis, which helped the tutee brainstorm for their paper. Noah then put an overly long thesis statement into ChatGPT and asked for three shorter options; the tutee took elements of each option because none of them was exactly right. Another tutee had a wordy sentence, and Noah put it into ChatGPT to make it concise, again asking for three options. The tutee liked one of the options and was able to continue the paper. A doctoral student came to Noah for help coming up with a title for a section of a paper and used one of the three options given. The authors highlight that ChatGPT is becoming an increasingly valuable tool in tutoring, though its use should be situational and collaborative.
El-Kishky et al. (2024, September 12). Learning to reason with LLMs. https://openai.com/index/learning-to-reason-with-llms
OpenAI's o1 model demonstrates advanced capabilities, ranking in the 89th percentile on Codeforces competitive programming problems, placing among the top 500 students in the USA Math Olympiad qualifier (AIME), and surpassing human PhD-level accuracy on science benchmarks (GPQA). An early version, OpenAI o1-preview, is available in ChatGPT and to select API users. Its training leverages a data-efficient reinforcement learning algorithm that enhances reasoning through chain-of-thought processes. Performance improves with increased training compute and test-time compute, and the constraints on scaling this approach differ from those of traditional large language model (LLM) pretraining, so research into scaling it continues.
o1 shows significant improvements over GPT-4o on reasoning-heavy tasks across human exams and ML benchmarks, achieving better pass@1 accuracy and improved consensus performance with 64 samples, and outperforming GPT-4o in 54 of 57 MMLU subcategories (evaluations used maximal test-time compute settings unless stated otherwise). On the 2024 AIME math exam, it significantly outperformed GPT-4o, solving 74% of problems with a single sample, 83% with consensus among 64 samples, and 93% using advanced re-ranking, placing it among the top 500 students nationally and above the USA Math Olympiad cutoff. In science, o1 surpassed PhD-level experts on the challenging GPQA-diamond benchmark, a first for AI; while not universally superior to PhD holders, it demonstrated exceptional proficiency in specific problem-solving areas. It also achieved a groundbreaking 78.2% on MMMU with vision perception enabled, making it competitive with human experts and advancing the state of the art across multiple benchmarks.
o1 uses a chain-of-thought process, similar to how humans think through complex problems. Reinforcement learning helps it refine strategies, recognize and correct mistakes, break down difficult steps, and adopt new approaches when needed; this iterative process significantly enhances its reasoning abilities. When o1-preview was evaluated against GPT-4o on challenging, open-ended prompts across various domains, human trainers reviewing anonymized responses preferred o1-preview by a wide margin on reasoning-heavy tasks like data analysis, coding, and math. However, o1-preview was less favored for some natural language tasks, indicating it is not ideal for all use cases.
The chain-of-thought approach also enhances alignment and safety by embedding safety rules into the model's reasoning process. This improves the model's ability to understand and apply human values in context, leading to substantial gains in handling jailbreak attempts and meeting safety refusal benchmarks, along with improved robustness in out-of-distribution scenarios and increased transparency in observing the model's reasoning. Extensive safety testing and red-teaming, following the Preparedness Framework, confirmed these improvements, though instances of reward hacking were observed. OpenAI sees the hidden chain of thought as a valuable tool for monitoring models, offering insight into their reasoning processes and potentially helping to detect behaviors like user manipulation in the future. To keep the chain authentic, it cannot be shaped by compliance training or user preferences, so, balancing transparency, user experience, and competitive considerations, OpenAI has chosen not to make raw chains of thought visible to users; the o1 series instead provides a model-generated summary and works useful insights from the chain of thought into its responses.
Overall, o1 represents a significant leap in AI reasoning, with plans for iterative improvement. Its advanced reasoning capabilities are expected to improve alignment with human values and unlock new applications in science, coding, math, and related fields, and OpenAI looks forward to users and developers exploring how o1 can elevate their work.
Escalante, J., Pack, A., & Barrett, A. (2023). AI-generated feedback on writing: Insights into efficacy and ENL student preference. International Journal of Educational Technology in Higher Education, 20(1). https://doi.org/10.1186/s41239-023-00425-2
Large Language Models (LLMs) like ChatGPT can produce essays that meet university-level standards. LLMs can also translate essays for English Language Learners (ELLs) from their first language into English and assist with writing academic English essays. Educators are increasingly using tools to detect whether essays have been generated by AI. Opinions on AI systems are divided, with many believing that educational practices need to evolve to incorporate this technology. ChatGPT is designed to generate the most likely response to a prompt in clear and understandable language, and its capabilities are advancing rapidly. This has raised concerns among educators about potential misuse by students. Since writing is a crucial aspect of education, fostering critical thinking and self-discovery, reforming this system may seem challenging and unappealing. However, historical precedents like the introduction of the pocket calculator demonstrate that we as a society cannot ignore technological advancements.
Educators must reconsider the purpose of writing assignments in this evolving context. Building on principles of second language acquisition, Ingley (2023) suggested several practical ways to utilize GenAI to enhance academic writing in English language contexts. For instance, they recommend using AI-enabled chatbots for questioning and reflecting on the output to generate ideas or gain a deeper understanding of a topic, rather than merely asking the AI to brainstorm topics. They also see AI as a valuable tool for providing feedback throughout various stages of the writing process. It is crucial for educators to identify and address both the ethical and unethical uses of AI. Many standardized exams, such as the GRE and TOEFL, already use AI to score a large volume of essays (Escalante et al., 2023).
Personalized learning through evaluation and feedback on essay writing has been identified as a potential strength of GenAI. However, it remains uncertain whether ChatGPT is an effective feedback tool or whether students will find and act on its feedback. AI can be lenient unless it is prompted multiple times. The general-purpose nature of the Large Language Models (LLMs) behind ChatGPT means its effectiveness as an automated writing evaluation tool needs further research before it can be widely implemented. In a recent study, ChatGPT provided more readable and detailed feedback than the instructor, and AI will not get tired or burned out after reading essay after essay. Another study demonstrated that ChatGPT was both reasonable and reliable in giving feedback to 12,100 non-English speakers. One limitation of ChatGPT is its tendency to provide inaccurate information; it is important to remind students that not all AI-generated content is reliable and that they should verify the information provided. Other concerns include bias and user privacy. Prompt engineering, the practice of optimizing prompt language to achieve the best performance from AI, can help improve outcomes. However, as a developing field, it is not yet clear whether prompt engineering consistently yields reliable corrective feedback from AI. Research has shown that English Language Learners (ELLs) often find human feedback confusing but can sometimes elicit responses from AI that are more understandable to them. AI also does well at following Universal Design for Learning (UDL) guidelines, which help ELLs understand the feedback they have been given. The study evaluates the effectiveness of ChatGPT and GPT-4 in enhancing language skills and assesses student perceptions within the ELL population, asking two research questions:
- Does the application of AI-generated feedback result in superior linguistic progress among ENL students compared to those who receive feedback from a human tutor?
- Does the preference for AI-generated feedback surpass that for human tutor-generated feedback among ELL students? (Escalante et al., 2023).
Both studies were conducted at a small liberal arts university in the Asia–Pacific region during the shortened Spring 2023 semester. The participants were divided into two groups: a control group that received feedback from a human tutor and an experimental group that received feedback generated by AI. In Study 2, students received feedback from both a human tutor and AI. Study 1 used a pre- and post-test design to measure linguistic progress: a diagnostic writing test administered on the first day of class served as the pretest, while a final writing exam served as the posttest. Participants were required to write a 300-word paragraph on various academic topics discussed in class and to incorporate sources from readings for these assessments and for a recurring weekly writing task. Study 2 assessed student preferences for human versus AI-generated feedback through a questionnaire that gathered both quantitative and qualitative data. No significant difference was found between the control and experimental groups; AI feedback was not more beneficial than human feedback. In Study 2, students were divided in their preference for feedback sources; human feedback was only slightly less favored than AI, potentially due to incomplete survey responses. Many students preferred human feedback for its personal interaction and the ability to engage directly with a tutor, which they found more engaging and beneficial for practicing both speaking and writing, and they appreciated the opportunity for follow-up feedback. Conversely, students who favored AI-generated feedback valued its clarity, consistency, and specificity, particularly regarding academic writing style and vocabulary, and liked the flexibility of accessing feedback at any time without constraints of time or availability. Some students preferred to receive both human and AI feedback simultaneously. How can we balance the two forms of feedback?
Kahn, J. (2024). AI Isn’t Coming for Your Job—At Least Not Yet. Fortune, 189(2), 17–18.
When people talk about recent layoffs, according to Kahn, they often talk about AI. UPS is automating work and laying off 12,000 office workers. BlackRock, a financial company, is eliminating 600 jobs and citing AI as one of the reasons. Google is also laying off workers and pointing to AI. IBM, often called Big Blue, has stopped hiring for 7,800 roles because of AI. The fear of "mass unemployment" caused by AI-driven job losses is frequently raised; however, unemployment in the U.S. is at an all-time low. Challenger, Gray & Christmas reports 4,600 jobs cut due to AI from May 2023 to January 2024, which led some to say that firms were hiding the true extent of the job losses to avoid backlash. It is possible that AI will cost more people their jobs; Goldman Sachs (2023) predicts that AI will automate the equivalent of 300 million full-time roles globally by 2030. Economists and analysts say that, at this time, employers are not replacing workers with AI on that scale. The AI rollout has been held back because employers are still trying to figure out the technology, and most top executives are aware of the "hallucinations," or errors, that happen with AI. Erin Ling, an assistant professor specializing in AI and the future of work at the University of Surrey in the U.K., says that many businesses are using AI as an excuse for the bad business practices that are really causing the layoffs, as with the UPS example above: at the same time as the layoffs, UPS also missed its revenue forecast. Google's layoffs are related to AI, but not for the reasons many think; Google is laying people off to free up money to invest in AI technology.
It is noted that humans still need to craft the prompt and check the output at the end. Carl Benedikt Frey, of the University of Oxford, says that there will be an "Uber effect," meaning that less skilled and less experienced workers will be able to take on higher-level tasks. AI can create "copilots" that will let everyday people perform legal, financial, or software coding tasks. Frey thinks that means more people will flock to these industries, lowering prices, which may mean that wages fall or stagnate. He also believes that eventually humans will not have to be as involved in the AI process, and that will lead to job loss. AI means that people need to be able to learn new skills. This is a very optimistic view given people's uneven access to technology: many parts of the country still lack reliable internet, so there is a question of whether everyday people will have access to the latest AI. Governments need to invest in education and retraining as well as access to the technology, which would also let workers leverage increased productivity into shorter work hours.
KU Center for Teaching Excellence. (2024, July). Ethical use of AI in writing assignments. https://cte.ku.edu/ethical-use-ai-writing-assignments
The Center asks writers to consider AI as part of a continuum where it can assist at almost any stage of the writing process, whether for inspiration, ideation, structuring, or formatting. AI can also support research, provide feedback, summarize content, and aid in creation. Many instructors fear that AI will write students' papers, which would eliminate the thinking and intellectual struggle that form ideas and arguments. Using unedited AI-generated work is academic misconduct. AI is already widespread, and its capabilities and integrations are set to expand further in the coming years. As students in grade school and high school increasingly use generative AI, they will come to college expecting to continue doing so. Writing is often seen as a product that showcases students' understanding and skills, particularly in advanced classes. However, in most courses, the focus is on learning the writing process rather than achieving perfection. As students progress and expectations increase, instructors still do not expect them to work in isolation: students receive feedback from various sources, including the writing center, collaborate with others, and draw on external influences. Viewing writing as a collaborative process allows us to consider how generative AI might be integrated into it. Kansas University believes students can learn to use generative AI effectively and ethically. Instead of viewing writing as an isolated task, it should be seen as a process that involves interacting with sources, ideas, tools, data, and others; generative AI is just another tool in this collaborative process. Early in the writing process, AI might help with generating ideas, narrowing the scope of the project, researching, outlining, and creating an introduction. AI tools can assist students at various stages of the research process:
- Finding sources: Bing and Bard can help with initial research, while specialty tools like Semantic Scholar and Elicit offer more focused searches.
- Connecting ideas: Tools like Research Rabbit and Aria (for Zotero) create concept maps, showing connections among ideas, and Elicit identifies patterns across research.
- Organizing references: Software like EndNote and Zotero store, organize, and format references.
- Summarizing work: AI tools like ChatGPT and Bing can summarize research papers, aiding in source evaluation.
- Interrogating research: AI can analyze papers or websites, allowing researchers to ask questions and find related sources.
- Analyzing data: AI can create narratives from data, integrating it into writing.
- Identifying patterns: AI can analyze notes and ideas to uncover hidden patterns and connections.
Deeper in the writing process, AI can help writers create drafts, titles, or section headers; help with transitions and endings; and provide feedback on details and on full drafts. Generative AI has significant weaknesses, as it will produce answers even when it lacks appropriate information. Students are still responsible for any errors in their work and cannot blame AI. Instructors should guide students to understand both the strengths and limitations of generative AI, emphasizing the need to verify all details. While understanding the AI continuum is valuable, it does not answer the key question many instructors have: "How much is too much?" There is no simple answer, as different disciplines may have varying approaches to generative AI, and instructors might set different boundaries depending on the assignment or student level. Here are some ways to approach it: talk with students about AI use, set boundaries, make the purpose of the assignment clear, include reflection, and engage with the writing center and the library.
Generative AI is rapidly advancing, with numerous tools already incorporating it and new ones emerging. Reflecting on how AI has already been integrated into academic life is instructive: AI-augmented tools like spell-check and auto-correct were initially met with some resistance but not panic. Grammar checkers soon followed, offering writing advice with little opposition. Citation software has evolved, simplifying the management and formatting of sources. AI was part of search engines long before generative AI became widely recognized. Generative AI may seem new, but it does not introduce new forms of cheating; students have long been able to buy papers, copy from online sources, or have others write for them. The real issue is not AI itself but rather existing challenges in academic structure, grading, trust, and purpose that instructors have been grappling with for years.
Lester, D. (2024). Tutors' Column: GenAI in the Writing Center. WLN: A Journal of Writing Center Scholarship, 48(3). https://doi.org/10.37514/WLN-J.2024.48.3.05
Because of Siri and Alexa, AI has become so prevalent that it can no longer be overlooked. However, ChatGPT and generative AI (GenAI) create a more concerning environment than in years past. Plagiarism is a huge issue with GenAI, but the larger problem is its ability to mimic human text. It creates the false idea that it can replace human writers, diminishing authorial ownership, which would be disastrous for diversity in voice and language. Fortunately, writing centers are well positioned to address these issues, and with widespread action, this could become an opportunity for systemic change. The author had ChatGPT analyze Siegfried Sassoon's poem "Glory of Women," putting in the professor's assignment and rubric, and was relieved to see that what ChatGPT came up with was very wrong: it completely missed Sassoon's irony. This means that ChatGPT is not a replacement for a human author. But would someone who was not as interested in analysis or in writing the paper care whether the AI was wrong? If a student lacks interest in the assignment and does not see their writing as meaningful, then ChatGPT's ineptitude would not matter to them. If a student does not want to do the reading and just copies and pastes the ChatGPT analysis, the student may still get a passing grade, because ChatGPT excels at following the rubric.
The author has seen, as both a student and a tutor, how important it is to follow the rubric. Some students will use AI to sound confident and academic just to pass their class. AI use, and trust in AI, will change contemporary writing in both education and the writing center. High-stakes writing makes students fear failure and humiliation, and using AI instead can seem like an easy solution; GenAI uniquely aggravates the problems that make students want to cheat or plagiarize. If a student brings in a misguided analysis like the ChatGPT one described above, a conversation can be had about AI and about the analysis: "I could then explain that the bot amounts to nothing more than several Google searches wearing an academic trench coat, warn them of overly relying on AI, and collaborate with the student to develop a deeper understanding of the poem" (Lester, 2024). However, a student could come in with a similarly misguided AI paper on a subject the tutor is unfamiliar with. How can the tutor assist that writer if the tutor is unaware of the use of AI? As there is no foolproof way to detect AI-written papers, tutors can only follow current writing center practices, which are not fully equipped to address current AI capabilities, though these strategies can be surprisingly effective. There needs to be a proactive effort by teachers and tutors to understand the technology, as students need guidance for effective and ethical use of AI.
There are multiple ways to use AI: one is to get (seemingly) accurate answers; another is to generate a disagreeable answer for the writer to respond to. Writing center workers need to remember that many writers are afraid to disagree with AI because it is advertised as intelligent and objective. It is, however, riddled with biases and sometimes restricted by its creators and owners. Decisions about these systems are not made democratically but by individuals and business owners who may be motivated more by profit than by informing AI users. These limitations need to be understood so that writing centers are aware of the dangers posed by GenAI in writing. GenAI, because of how it is created and monetized, obscures genuine positional perspectives and restricts diversity in authorial voices. When writers rely on GenAI, authentic voices are filtered through a sieve where diversity and personal flair are removed in favor of language that disproportionately favors Western ideologies and academic writing. Graham Stowe, director of Canisius University's writing center, warns that GenAI has the potential to perpetuate systemic injustice when it diminishes authorial voice and diverse communication; the AI voice is hegemonic, and the linguistic systems embedded in AI tend to be the dominant ones. "Clean" or "error-free" prose is touted as a benefit of GenAI, but "neutral" or "correct" language tends to be the language of the group with social power.
White mainstream American English is GenAI's default language, which erases linguistic variation. Reinforcing student ownership is one way to deal with fearful or unmotivated students, whether or not the student has used AI in their paper. For example, one student felt disconnected from her writing, and it was suspected that AI had been used in writing her paper on mass incarceration. She hated writing and expressed that she was bad at it, but when asked why she picked that topic, she spoke about it passionately and at length. When her words were read back to her, she was surprised at what she had said. That is a model of how to turn speech into writing: whether or not GenAI was used, she was able to take renewed ownership of her words and her paper. Even though GenAI is the newest threat, academic writing was exclusionary and homogeneous long before it.
Therefore, with or without AI, writing centers must remain conscious of student ownership and of honoring diverse positionalities, voices, and linguistic variation. Although tutors have some tools to address linguistic racism, GenAI exacerbates the fundamental, systemic issues within writing, heightening the need for more radical, community-wide changes in writing centers and classrooms. This has been a problem for a long time, and there is no magic bullet to fix it. However, if there is any potential benefit to GenAI's proficiency in and adherence to white mainstream American English (despite its negative impact on voice and language), it is that GenAI has highlighted the often-elusive linguistic shift toward white patriarchy. This presents an opportunity to introduce more innovative and radical practices to address systemic linguistic injustice, building on current approaches to technology literacy and fostering a sense of ownership.
Manyika, J. (2022). Getting AI Right: Introductory Notes on AI & Society. Daedalus, 151(2), 5–27. https://www.jstor.org/stable/48662023
Manyika points out that the prospect of achieving powerful artificial general intelligence has long captivated the imagination of society, not only for its thrilling and troubling possibilities but also for its promise of ushering in a new, uncharted era for humanity. Stuart Russell, in his 2021 BBC Reith Lectures, titled "Living with Artificial Intelligence," says, "The eventual emergence of general-purpose artificial intelligence [will be] the biggest event in human history." There has been a rapid succession of impressive results in the past decade. In machine vision, researchers developed systems capable of recognizing objects as well as or even better than humans in certain situations. Complex strategy games have long been linked with superior intelligence, so when AI systems defeated top human players in chess, Atari games, Go, shogi, StarCraft, and Dota, it captured global attention. The escalating progression of their methods began with learning from expert human play, then advanced to self-play, and eventually involved teaching themselves the fundamental principles of the games from scratch. This led to the development of single systems capable of learning, playing, and winning across several structurally different games, suggesting the potential for generally intelligent systems.
There has been a recent emergence of Large Language Models (LLMs) capable of generating human-like text. Progress in language is especially significant due to the central role language has always played in human concepts of intelligence, reasoning, and understanding. There have also been breakthroughs across various scientific fields where AI and related techniques have been employed to advance research, spanning from materials and environmental sciences to high-energy physics and astronomy. An example is AlphaFold's results on the fifty-year-old protein-folding problem. AI and its associated technologies are already present and permeate our daily lives more than many people realize. Examples include recommendation systems, search, language translators (now covering more than one hundred languages), facial recognition, speech to text (and back), digital assistants, chatbots for customer service, fraud detection, decision support systems, energy management systems, and tools for scientific research. Nick Bostrom, the director of the Future of Humanity Institute at the University of Oxford, noted in 2006, "A lot of cutting-edge AI has filtered into general applications, often without being called AI because once something becomes useful enough and common enough it's not labeled AI anymore."
Concerns arise when these systems have not performed well, such as with bias in facial recognition, or have been misused, as in the case of deepfakes. There are also issues when AI causes harm, such as with crime prediction, or is linked to accidents, such as fatalities involving self-driving cars. One definition of AI is “the ability of a digital computer or computer-controlled robot to perform tasks commonly associated with intelligent beings.” The human abilities referenced in such definitions encompass visual perception, speech recognition, reasoning, problem-solving, meaning discovery, generalization, and learning from experience. Another, less human-centric definition is “any system that perceives its environment and takes actions that maximize its chance of achieving its goals.” Concepts of intelligent machines have origins dating back to antiquity. Philosophers such as Hobbes, Leibniz, and Descartes long envisioned artificial intelligence, with Daniel Dennett even suggesting that Descartes may have anticipated the Turing Test. The idea of computation-based machine intelligence is linked to Alan Turing’s invention of the universal Turing machine in the 1930s and the ideas of several of his mid-twentieth-century contemporaries. However, the formal birth of artificial intelligence and the use of the term is commonly attributed to the Dartmouth summer workshop of 1956. This workshop emerged from a proposal by John McCarthy, Marvin Minsky, Nathaniel Rochester, and Claude Shannon, who aimed to explore how to enable machines to use language, form abstractions and concepts, solve problems typically reserved for humans, and improve themselves.
The current AI resurgence has been ongoing since the 1990s, with notable breakthroughs emerging rapidly over the past decade. Jeffrey Dean refers to this period as a “golden decade” in his essay, highlighting not only the swift advancement in AI but also its widespread application across various societal sectors and scientific research. This era is marked by a focus on achieving artificial intelligence through experiential learning, characterized by the success of neural networks, deep learning, reinforcement learning, and methods from probability theory as key approaches for machine learning. In the 1950s, two dominant visions emerged for achieving machine intelligence. The first vision involved using computers to create logical and symbolic representations of the world and human knowledge, aiming to develop systems capable of reasoning about the world and exhibiting intelligence similar to the human mind. This approach was championed by Allen Newell, Herbert Simon, Marvin Minsky, and others. It was closely linked to the “heuristic search” method, which posited that intelligence involved exploring a space of possibilities for solutions.
The second vision, inspired by the brain rather than the mind, focused on learning to achieve intelligence. This approach, known as the connectionist approach, involved connecting units called perceptrons in a manner analogous to the connections between neurons in the brain. At that time, this approach was primarily associated with Frank Rosenblatt. Although there was excitement for both visions, the heuristic search method came to dominate, producing some successes, including so-called expert systems. This approach not only gained support from its advocates and substantial funding but also drew upon a rich intellectual tradition, represented by figures such as Descartes, Boole, Frege, Russell, and Church, that focused on manipulating symbols and formalizing knowledge and reasoning. Interest in the second vision began to resurface only in the late 1980s, largely due to the contributions of David Rumelhart, Geoffrey Hinton, James McClelland, and others. The current dominant approach to intelligence is characterized by learning-based methods, the use of statistical techniques, back-propagation, and both supervised and unsupervised training. In his essay “I Do Not Think It Means What You Think It Means: Artificial Intelligence, Cognitive Work & Scale,” Kevin Scott highlights the contributions of Ray Solomonoff and others who connected information and probability theory with the concept of machines capable of learning, compressing, and generalizing knowledge. This vision is now becoming a reality in the systems being developed and those on the horizon. The success of the machine learning approach has been greatly enhanced by the surge in data availability, driven by the growth of the Internet and various applications and services. Additionally, the data explosion in research has been fueled by advancements in scientific instruments, observation platforms, and breakthroughs in fields like astronomy and genomics.
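To make the connectionist idea concrete, here is a minimal sketch, not drawn from Manyika's essay, of the kind of unit Rosenblatt proposed: a single perceptron trained on labeled examples (supervised learning). The AND-gate data, learning rate, and epoch count are illustrative assumptions.

```python
# A minimal perceptron: one "unit" with adjustable weights, trained on labeled examples.
# The AND-gate data and hyperparameters below are illustrative, not from the essay.

def train_perceptron(samples, labels, epochs=20, lr=0.1):
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            # Weighted sum followed by a hard threshold (the classic perceptron rule).
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            prediction = 1 if activation > 0 else 0
            error = target - prediction
            # Nudge the weights toward the correct answer whenever the unit is wrong.
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

# Supervised training data for the logical AND function.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]
weights, bias = train_perceptron(samples, labels)
for x in samples:
    output = 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0
    print(x, "->", output)
```

Back-propagation, credited above to Rumelhart, Hinton, and McClelland, generalizes this kind of error-driven weight adjustment to networks with many layers of such units.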
Equally crucial has been the parallel development of software and hardware, particularly chip architectures designed for the parallel computations required by data-intensive neural networks and other machine learning methods, as discussed by Dean. In their essay, “Searching for Computer Vision North Stars,” Fei-Fei Li and Ranjay Krishna (2022) examine progress in machine vision and the creation of benchmark data sets like ImageNet. Chris Manning and Yejin Choi (2022), in their essays “Human Language Understanding & Reasoning” and “The Curious Case of Common-Sense Intelligence,” discuss various eras and concepts in natural language processing, including the rise of large language models with hundreds of billions of parameters. These models use transformer architectures and self-supervised learning on extensive data sets. As Mira Murati (2022) illustrates in “Language & Coding Creativity,” these pretrained models excel at generating human-like outputs in natural language, as well as in images, software code, and more. These large language models are increasingly referred to as foundational models due to their adaptability to a broad range of tasks once trained.
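As a rough illustration of the self-supervised learning described above, and not a detail from the essays themselves, the sketch below shows how a text supplies its own training labels: each next word becomes the prediction target for the words that precede it. The toy sentence, whitespace tokenizer, and context size are assumptions chosen only for readability.

```python
# Self-supervision in miniature: the text itself provides the training labels.
# Sliding a window over the corpus yields (context, next_token) pairs with no
# human annotation, which is the core idea behind pretraining large language models.
# The toy sentence, whitespace tokenizer, and context size are illustrative assumptions.

corpus = "the writing center helps students find their own voice in their own words"
tokens = corpus.split()

CONTEXT_SIZE = 3
training_pairs = []
for i in range(len(tokens) - CONTEXT_SIZE):
    context = tuple(tokens[i:i + CONTEXT_SIZE])   # what the model is shown
    target = tokens[i + CONTEXT_SIZE]             # what it must learn to predict
    training_pairs.append((context, target))

for context, target in training_pairs:
    print(f"{' '.join(context):30} -> {target}")
```

A production model would push millions of such pairs through a transformer and adjust billions of parameters so the predicted token matches the actual one; the pairing step itself is what makes the training “self-supervised.”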
Despite their impressive capabilities, these models are still in their early stages of development and have notable shortcomings and limitations, as discussed in this volume and by some of their developers. In “The Machines from Our Future,” Daniela Rus (2022) examines advancements in robotic systems, focusing on improvements in both the underlying technologies and their integrated design that enables them to function effectively in the physical world. She points out the limitations of current “industrial” approaches and proposes new conceptualizations of robots based on insights from biological systems. In robotics, as with AI in general, there has always been a debate about whether to replicate or simply draw inspiration from the ways humans and other biological organisms exhibit intelligent behavior. AI researcher Demis Hassabis and his team have explored the mutual influence between neuroscience and AI, though, as noted by Alexis Baria and Keith Cross, the inspiration has so far flowed more from neuroscience to AI than vice versa.
AI continues to face significant shortcomings and limitations, including issues of performance, safety, bias, privacy, misinformation, and explainability, which can undermine public trust. These concerns have drawn increasing attention from the public, regulators, and researchers, leading to efforts to develop responsible AI principles and best practices. Additionally, there is a notable lack of diversity in AI research and development, particularly regarding gender and race, which impacts both the technology itself and its broader societal implications. In their Turing Lecture, deep learning pioneers Yoshua Bengio, Yann LeCun, and Geoffrey Hinton reviewed the current state of deep learning and its limitations, such as challenges with out-of-distribution generalization. In natural language processing, Chris Manning and Yejin Choi highlighted persistent difficulties in reasoning and common-sense understanding despite the impressive performance of large language models. Computational linguists Emily Bender and Alexander Koller have critiqued the idea that these models genuinely understand, learn, or create meaning. Kobi Gal and Barbara Grosz, in their discussion on multi-agent systems, addressed complex issues related to reasoning about other agents and ethical challenges in both cooperative and competitive environments involving humans and machines. Additionally, Allan Dafoe (2020) and others have outlined open problems in cooperative AI, revealing a growing concern about the lack of adequate theories for integrating AI systems into society as their capabilities and applications expand.
There are instances where embedded AI capabilities not only assist in evaluating results but also guide experiments, surpassing traditional heuristic-based experimental design to create what some refer to as “self-driving laboratories.” However, enabling AI to understand science and mathematics, theorize, and develop novel concepts remains a significant challenge. The potential for more advanced AI to drive new scientific discoveries and make transformative progress on major challenges and opportunities facing humanity has long been a major motivator for researchers at the forefront of AI to develop more capable systems. General hard problems that continue to constrain the advancement of AI include one-shot learning, cross-domain generalization, causal reasoning, grounding, time and memory complexities, and meta-cognition. This raises the question of whether current methods (primarily deep learning, larger and more foundational models, and reinforcement learning) are adequate or if new conceptual approaches, such as those inspired by neuroscience or based on logic and probability theory, are needed.
While opinions vary within the AI community, many believe that current approaches and ongoing advancements in computing and learning architectures have not yet reached their full potential. The debate over the adequacy of current AI approaches is closely tied to the question of whether artificial general intelligence (AGI) can be achieved, and if so, how and when. Unlike narrow AI, which is designed for specific tasks, AGI aims to create an AI as versatile and powerful as humans, capable of handling any problem, evolving and improving itself, and setting its own goals. While opinions vary on the feasibility and timeline for achieving AGI, there is a consensus that its development would have significant and potentially transformative implications, both positive and negative. Regardless of its timeline, there is increasing consensus among leading AI researchers that we as a society should prepare for the potential development of powerful AGI. This includes addressing issues related to safety, control, alignment with human values, governance, and the possibility of multiple types of AGI. These considerations should be integral to how we as a society approach AGI development. Most AI investment, research, development, and commercial activity today focus on narrow AI and its various applications, driven by the potential for practical and economic benefits across multiple sectors. However, a few organizations, notably DeepMind and OpenAI, are dedicated to developing AGI. While these organizations have made significant progress towards more general AI capabilities, AGI remains a distant goal.
The societal impact of AI and automation on jobs and the future of work has been a topic of discussion for decades. In 1964, President Lyndon Johnson established a commission that concluded technology, while disruptive, ultimately destroys jobs but not work, and contributes to economic growth. Recent studies echo this, indicating that technology often leads to more job creation over time, though challenges arise from sectoral and occupational transitions, skill requirements, and wage effects. Laura Tyson and John Zysman (2022) explore these work-related implications, while Michael Spence (2014) addresses income and wealth distribution issues, particularly in developing countries. Erik Brynjolfsson (2022) warns that using human benchmarks in AI development might lead to AI that replaces rather than complements human labor, with the direction and outcomes depending on the incentives for researchers, companies, and governments. There is concern that the optimistic view that AI will create more jobs than it destroys may be overly reliant on historical patterns and not fully account for future developments in AI.
Key arguments for why AI might differ from past technological changes include the rapid pace of technological advancement compared to the slow adaptation of labor markets and societal systems, and AI's expanding role beyond mechanizing physical and routine tasks to potentially performing cognitive, creative, and socioempathic tasks. This shift suggests that AI's capabilities may soon match the range of problems addressed by the human mind, as envisioned by Herbert Simon and Allen Newell in 1957. When considering the potential impacts of AI on the future of work, two main responses often arise: 1. New labor markets will emerge where human-performed tasks are valued for their intrinsic worth, even if machines can perform them as well or better. 2. AI might generate such vast wealth and abundance that it could meet everyone’s needs without requiring human labor, leading to the challenge Keynes envisioned of how to use newfound leisure wisely. However, most researchers believe we as a society are far from reaching this scenario.
In the meantime, challenges such as inequality, wage effects, education, and the integration of humans with advanced machines need to be addressed, as discussed by Laura Tyson, John Zysman, Michael Spence, and Erik Brynjolfsson. AI impacts more than just employment; it also influences the broader economy. Russell (2022) estimates that the potential economic impact of fully realized artificial general intelligence could be enormous, with global GDP reaching $750 trillion, roughly ten times the current figure. Even before achieving AGI, narrow AI and related technologies offer significant commercial opportunities and potential productivity gains, driving intense competition among companies and nations. While the U.S. is often seen as leading in AI, China is rapidly emerging as a major player, bolstered by its advancements in AI research, infrastructure, and ecosystems. This competition may alter market dynamics and affect how companies and countries approach AI development.
Additionally, the competitive environment could hinder responsible AI practices and collective action on issues like safety, as noted by Amanda Askell, Miles Brundage, and Gillian Hadfield. Nations are motivated to lead in AI not only for economic reasons but also for national security. AI's role in areas such as surveillance, cyber operations, defense systems, and even disinformation highlights its growing importance in global security. Eric Schmidt, co-chair of the U.S. National Security Commission on Artificial Intelligence, outlines the risks AI poses to international stability and advocates for exploring shared limits and treaties on AI, even among rivals. He also emphasizes the need for confidence-building measures to manage risks and foster trust. Meanwhile, Russell and Shadbolt raise concerns about autonomous weapons and weaponized AI. In “The Moral Dimension of AI-Assisted Decision-Making: Some Practical Perspectives from the Front Lines,” former Secretary of Defense Ash Carter draws lessons from other national security technologies, like nuclear weapons, to address the ethics of AI in automated decision-making. He notes significant differences between AI and nuclear technologies, particularly that AI development is driven by the private sector with global ambitions. Schmidt (2022) highlights how AI's development tends to centralize around leading entities, creating tension between commercial interests and national security concerns. The implications for entities not leading in AI development, including potential benefits and contributions to global challenges, are not fully explored. The COVID-19 pandemic serves as a stark example of the human cost when key discoveries, such as vaccines, are not shared equitably.
As AI usage has expanded across various sectors, such as healthcare, finance, public services, and commerce, it has often enhanced effectiveness and optimized costs and performance. However, it has also raised issues of bias and fairness, typically stemming from biased training data and societal systems. Sonia Katyal (2022), in “Democracy & Distrust in an Era of Artificial Intelligence,” argues that AI systems, like political systems, can foster distrust when lacking representation and participation. Cynthia Dwork and Martha Minow (2022), in “Distrust of Artificial Intelligence: Sources & Responses from Computer Science & Law,” address the problems of missing ground truth and the conflicts between user utility and privacy concerns. To address these challenges while harnessing AI's benefits, Mariano-Florentino Cuéllar and Aziz Huq explore the concept of “artificially intelligent regulation.” Governments and organizations are increasingly drawn to AI for its potential to create powerful “seeing rooms” for observing and optimizing data. Diane Coyle, in “Socializing Data,” examines the history and risks associated with such systems, especially when markets drive the use of social data.
For the public sector, AI offers significant opportunities to enhance service delivery and effectiveness. Helen Margetts, in “Rethinking AI for Good Governance,” explores what AI might look like in the public sector, identifying key challenges related to resource allocation and normative considerations. She discusses how governments can leverage AI for the most ambitious and beneficial societal outcomes. As AI increasingly replicates or surpasses human traits like intelligence, creativity, and empathy, it raises profound questions about what it means to be human. How much of human identity relies on the mystery and limitations of human capabilities? What happens when these aspects are replicated or enhanced by machines? This prompts a reexamination of our socioeconomic systems, social infrastructure, and concepts of justice, representation, and inclusion, challenging us to adapt our notions and policies for the age of AI.
Despite their limitations, large language models offer a unique opportunity to explore fundamental questions about humanity in the context of advanced machines. As Tobias Rees notes, these models act as a “laboratory” for examining society’s relationship with AI. Reflecting Dennett’s view from 1988, AI has not resolved ancient philosophical questions but has introduced new ways to expand society’s philosophical thinking. Mira Murati (2022) discusses how humans might interact with machines capable of generating human-like creativity, using examples from GPT-3, and suggests that we as a society may need to reconsider our roles in work and creative endeavors. Blaise Agüera y Arcas, in his essay “Do Large Language Models Understand Us?”, asks whether society’s standards for intelligence, understanding, and consciousness are constantly shifting to keep these traits uniquely human. He uses outputs from Google’s LaMDA to explore these ideas. Pamela McCorduck’s historical perspective from Machines Who Think (1979) highlights how advances in AI have often been dismissed as not true thinking. Agüera y Arcas (2022) questions whether the processes behind machine “thinking” differ from human thinking in meaningful ways. Tobias Rees, in “Non-Human Words: On GPT-3 as a Philosophical Laboratory,” examines how historical views of language and the human experience may evolve, potentially leading to a future where language is distinct from humanity. In “Signs Taken for Wonders: AI, Art & the Matter of Race,” Michele Elam discusses how transformative technologies like AI can shape and narrow society’s understanding of humanity by promoting particular views while excluding others, particularly those of marginalized communities. She emphasizes the importance of including diverse perspectives in AI development. Timnit Gebru has highlighted how AI can exacerbate systemic discrimination against marginalized groups. Additionally, Blaise Agüera y Arcas, Margaret Mitchell, and Alexander Todorov critique how AI's use in correlating physical traits with nonphysical attributes can perpetuate harmful essentialist thinking, echoing past practices of physiognomy.
Advances in AI not only heighten the importance of addressing ethical issues but also expose existing societal problems. Algorithms and automated decision-making can reinforce and magnify societal inequalities while potentially introducing new forms of inequality. This challenge presents an opportunity for society to rethink its understanding of fairness. Iason Gabriel, in “Toward a Theory of Justice for Artificial Intelligence,” uses Rawls’s theory of justice to explore how AI intersects with distributive justice, suggesting that the unique nature of AI may require new approaches to ensuring justice and equality. As AI advances, aligning it with human safety, control, goals, and values becomes increasingly important. This issue, noted by Turing and discussed by Russell, is debated among researchers. While some think concerns about AI control are exaggerated, others stress the need to address these issues early in AI development. Russell suggests using inverse reinforcement learning to help AI understand human preferences, while Gabriel explores various complex interpretations of alignment. Both highlight significant normative challenges and the difficulties posed by the variability of human preferences. John Tasioulas, in “Artificial Intelligence, Humanistic Ethics,” argues that aligning AI with human preferences is not enough; we as a society must first determine what those preferences should be. He critiques the reliance on preference utilitarianism and market mechanisms, often driven by wealth maximization and GDP, which can overshadow nonmarket values. Tasioulas suggests that advanced AI reveals the ongoing need to address complex ethical issues, such as fairness and moral decision-making, highlighting that these are no longer abstract problems but practical concerns in developing autonomous systems.
Throughout the history of AI, researchers have frequently asked how its performance compares to human capabilities in various tasks, from playing games to creating art. This question can be enriched by comparing AI to both the best human performers in these tasks and to average humans engaged in similar activities. As AI capabilities advance, the societal implications become more complex. For instance, should we as a society raise standards for human performance in safety-critical tasks to match AI capabilities? Determining when AI is “good enough” is another critical consideration. Ultimately, the key comparison is whether AI can meet societal needs and benefit society as intended. The choices we as a society make about AI's design, development, and deployment will shape its impact on society. A logical response might be to halt all AI development and deployment, essentially putting the genie back in the bottle. However, this approach seems impractical given the immense economic and strategic stakes, the fierce competition between countries and companies, and the tangible benefits AI provides to its users. Additionally, the promising opportunities AI offers to society, some of which are already realized, make a complete halt unlikely. Is it worth it? The affirmative answer is conditional and depends on addressing and resolving these challenging issues. Currently, most human ingenuity, effort, and resources are disproportionately directed towards commercial applications and the economic potential of AI, rather than the critical issues necessary for AI to truly benefit humanity. We have the opportunity to shift this focus.
Maphoto, K. B., Sevnarayan, K., Mohale, N. E., Suliman, Z., Ntsopi, T. J., & Mokoena, D. (2024). Advancing students’ academic excellence in distance education: Exploring the potential of Generative AI integration to improve academic writing skills. Open Praxis, 16(2), 142–159. https://doi.org/10.55982/openpraxis.16.2.649
Research highlights that students often struggle with academic writing due to a lack of preparation and skills, which are crucial for academic success. In response to this “academic unpreparedness,” there is an increasing use of generative AI-powered writing tools. These AI tools, including Large Language Models (LLMs), machine translators (MTs), digital writing assistants (DWAs), and ChatGPT, create personalized and inclusive learning environments that address diverse student needs. While these tools are enhancing academic writing modules by facilitating assessment, tutoring, content generation, and feedback, LLMs also raise concerns about academic misconduct. Van Dis et al. (2023) note, “ChatGPT can create well-written student essays, summarize research papers, answer questions well enough to pass medical exams, enhance academic writing, and generate helpful computer codes.” The integration of AI-powered writing tools in higher education, especially in distance education, presents significant challenges. Key concerns include academic misconduct, over-reliance on technology, authenticity issues, and the risk of promoting passive learning. Turnitin, widely used for plagiarism detection, faces criticism for its limited effectiveness in detecting AI-generated content and extensive quoting.
While generative AI can enhance engagement and alleviate monotony in academic writing, its integration raises questions about balancing AI and human intelligence in education. Further research is needed to determine the optimal use of AI tools to improve academic writing skills and academic excellence. The study’s central question is: “What are lecturers, students, and markers’ perceptions towards the use of generative AI on academic writing skills in the context of ODeL?” (Maphoto et al., 2024). Recent advancements have led to the increased use of generative AI tools like ChatGPT, Pro Writing Aid, and Grammarly in educational settings, with many lecturers supporting their integration. These tools are seen as helpful for identifying errors, improving structure, and enhancing writing features, fostering student responsibility by offering insights and suggestions. However, concerns about plagiarism and ethical use of AI tools persist, as highlighted by scholars like Aljohani, Khalil and Er, and Sevnarayan and Maphoto. To mitigate these risks, suggestions include revising assessment methods and setting usage guidelines. Despite some reservations, recent studies by Kiryakova and Angelova (2023) show a generally positive attitude among lecturers towards AI integration. Bozkurt emphasizes the need for transparent reporting and human oversight to maintain academic integrity and address ethical concerns. Understanding students’ views on generative AI in academic writing is essential as educational contexts evolve. Students have mixed feelings: while they value the efficiency and personalized learning AI offers, they worry about over-reliance, creativity, and authenticity.
Generative AI has improved writing quality, grammar, vocabulary, and engagement, and has helped reduce plagiarism. However, issues with plagiarism detection and the impact on writing authenticity remain concerns. Integrating generative AI in education enhances efficiency, quality, and learning opportunities by transforming academic writing into a continuous process. It offers personalized feedback, supports iterative improvement, and uses gamification to boost engagement and motivation. By recognizing achievements and outlining clear skill development pathways, AI helps students improve their writing and develop a positive attitude toward academic challenges. Additionally, AI tools assist with plagiarism detection and guide students in adapting their writing styles, further improving learning outcomes. The literature highlights benefits of integrating generative AI in academic writing, but it also reveals biases and inconsistencies, such as issues with plagiarism, ethics, privacy, and digital competencies. Further research should delve deeper into students’ perceptions, particularly regarding authenticity, creativity, and data privacy. Additionally, future studies should investigate the long-term effects of generative AI on writing skill development and student learning outcomes across various educational contexts.
The integration of generative AI in academic writing can be effectively explored using socio-cultural theory (SCT), developed by Lev Vygotsky. SCT focuses on the impact of social interaction and cultural context on cognitive development. It helps investigate how generative AI supports students within their zone of proximal development (ZPD), providing necessary scaffolding for writing skills. SCT allows researchers to examine the social and cultural dimensions of language learning with AI, emphasizing the interactions between students and the tool and the broader educational context. This theory supports a human-AI collaboration framework by considering cultural, social, and contextual factors and fostering a balance between human creativity and technological systems. The human-AI collaboration framework emphasizes the importance of trust in adopting generative AI and identifies limited awareness as a barrier to broader acceptance. It examines how generative AI, particularly LLMs, can enhance academic writing by boosting productivity, quality, and creativity. The framework underscores the value of LLMs for educational empowerment, highlighting their adaptive interactivity and real-time feedback, which resemble the benefits of one-to-one tutoring. This collaborative approach addresses current challenges and opportunities, impacting both immediate writing projects and long-term educational development.
This study focuses on online distance learning modules at a South African open distance e-learning (ODeL) university, which enrolls about 500,000 students annually from 132 countries, including Nigeria, Namibia, Zimbabwe, India, Congo, Ethiopia, the USA, and China. The diverse student body spans various financial, linguistic, and social backgrounds, with ages ranging from 18 to 70. The Academic Writing (WRI124) module, offered by the Department of English Studies, aims to improve students’ academic English proficiency, critical reading, and writing skills. WRI124 serves a diverse student body, including both native English speakers and those for whom English is a second language. This fully online module conducts all assessments through the Moodle learning management system (LMS). Most students study part-time while working full-time and come from various ethnic backgrounds, including Black, White, Indian, and Asian, with many from middle- to low-income families. Many students, who live in remote areas with limited internet access, complete their assessments using cell phones or at local internet cafes.
In the initial phase of ChatGPT’s introduction, researchers observed a significant increase in its use for assignments during the first semester of 2023. This led to a sense of helplessness among lecturers, who struggled to address the new issue of excessive reliance on generative AI for online assessments. While students adeptly used the technology, a digital divide emerged, with some lecturers either struggling with or not understanding the technology. This situation prompted the researchers to explore the integration of generative AI in the WRI124 module. The study aimed to investigate the potential of generative AI to enhance students’ academic writing skills, noting that lecturers did not use AI as a teaching tool and that the study was not a pilot project. Discussions and interviews were the primary methods used to explore phenomena in this study. These approaches were chosen to gain deep insights from a large-enrollment module and assess whether generative AI can be effectively used in Academic Writing modules to improve students’ writing skills. This research employed a phenomenological approach to deeply understand participants’ experiences with generative AI.
By focusing on lived experiences, phenomenology provides a powerful framework for uncovering the deeper layers of how generative AI impacts participants. The study aimed to offer profound insights into these experiences, going beyond superficial observations. This research used a qualitative methodology to explore various approaches and applications of generative AI in teaching and learning. To address the research questions, a triangulation of research instruments was employed, leveraging the strengths of each to gain a comprehensive understanding of the impact of generative AI on academic writing. The study employed three research instruments: e-mail interviews, focus group discussions (FGDs), and a WhatsApp group discussion to enable a comprehensive exploration. E-mail interviews, based on Hunt and McHale’s framework, were used for asynchronous communication with lecturers. This method was crucial for systematically collecting detailed information on technical reliability, multimedia integration, engagement strategies, communication dynamics, and collaborative aspects of using generative AI in the WRI124 module. The study focused on approximately 14,000 students enrolled in the WRI124 module. Twenty lecturers from the Department of English Studies were interviewed via email and named as ‘Lecturer 1’, ‘Lecturer 2’, etc. Twenty students participated in a focus group discussion via the module’s Telegram group, labeled ‘Student 1’, ‘Student 2’, etc. Additionally, thirty markers provided insights on a WhatsApp group about how generative AI can motivate students to improve their academic writing, with markers identified as ‘Marker 1’, ‘Marker 2’, etc. Purposive sampling was used to select participants to ensure a detailed exploration of their perceptions. The research employed a multifaceted approach to investigate stakeholders’ views on integrating generative AI into academic writing at an ODeL university. Data collection occurred in November and December 2023.
To address the first research question about lecturers’ perceptions, 20 lecturers were emailed interview questions, with 12 responding over two weeks. For the second question on student perceptions, 20 students were invited to a focus group discussion via the module Telegram group, but only 12 participated in a 45-minute Microsoft Teams session. For the third question, 30 markers were queried in a WhatsApp group, with only 10 participating in the discussion. The research explored three main themes: 1) lecturers’ perceptions of using generative AI in the WRI124 module, 2) students’ views on how generative AI could guide them in academic writing, and 3) the potential of generative AI to motivate students to enhance their writing skills. To ensure content validity, the research used a multifaceted approach with diverse data collection methods, including e-mail interviews with lecturers, a focus group discussion with students, and a WhatsApp group discussion with markers. This triangulation enhanced the exploration of stakeholders’ perceptions of generative AI's potential in academic writing and strengthened the credibility and richness of the insights gathered. The research questions were carefully crafted and validated by experts in educational technology and qualitative research, ensuring their relevance and trustworthiness. For reliability, the study maintained rigorous inter-rater reliability, consistent purposive sampling, and standardized data collection procedures to ensure the consistency and dependability of the findings.
Lecturers expressed a critical view on balancing traditional and technological approaches to academic writing. Lecturer 1 emphasized that while generative AI can be a useful tool, students should not rely on it as a quick fix but should immerse themselves in traditional writing practices. Lecturer 2 also supported the idea of combining traditional and technological methods, stressing that AI should be viewed as a tool, not a shortcut, and that students must use it responsibly. Lecturer 3 advocated for integrating traditional skills with AI benefits, rather than replacing one with the other, to achieve comprehensive learning. The findings highlighted the need for a balanced approach, where generative AI complements traditional methods and student responsibility is emphasized in using AI effectively. This theme addressed the need to adapt to technological advancements in language learning while being cautious of overreliance on generative AI. Lecturer 4 emphasized understanding AI's role in enhancing learning but warned against using it to complete assignments. Lecturer 5 supported the integration of technology as a supplement to, not a replacement for, human effort, while stressing the importance of caution. Lecturer 6 highlighted that AI should amplify students’ efforts rather than replace active engagement in learning. Overall, the findings acknowledged AI's potential benefits but stressed the importance of using it as a supplementary tool and maintaining active student participation.
In exploring generative AI for enhancing writing skills, Lecturer 7 advocated for a balanced approach where students independently write and use AI only for text correction and comparison. Lecturer 8 supported AI as a tool for development but stressed that students must write independently, with AI serving to enhance their work. Lecturer 9 reinforced that while AI can assist in refining writing, students need to actively engage in the writing process for true skill development. All three lecturers recognized AI's potential in improving writing but emphasized that it should supplement, not replace, personal effort. Lecturer 9 notably highlighted the critical role of active student involvement alongside AI use. Lecturers emphasized their crucial role in guiding students on the use of generative AI. Lecturer 10 stressed the need for clear instructions to prevent misuse and optimize AI's benefits. Lecturer 11 highlighted the importance of demystifying AI and setting expectations to ensure responsible use. Lecturer 12 added that lecturers should guide students in using AI in alignment with learning objectives and academic standards. The findings underscored that lecturers must actively engage in facilitating responsible and effective AI use in academic writing, ensuring it supports educational goals and maintains academic integrity.
In the focus group discussion, first-year students, particularly those for whom English is a second language, expressed significant skepticism about using generative AI in academic writing. Concerns included AI's ability to understand multilingual expressions (Student 1), fears of AI altering personal writing style and authenticity (Student 2), and doubts about AI's reliability impacting academic performance (Student 3). However, Student 10 offered a contrasting view, seeing AI as a valuable tool for enhancing clarity and generating ideas, suggesting a willingness to embrace its benefits. The discussion highlighted a mix of apprehension and optimism among students regarding generative AI's role in their writing. The findings showed that students are generally optimistic about generative AI's potential to enhance their writing skills, especially in areas like grammar and structure. Student 4 viewed AI as a valuable tool for addressing writing challenges, while Student 5 saw it as a supportive companion available 24/7. Student 6 highlighted AI's practicality in saving time and reducing stress. However, Student 11 raised concerns about over-reliance on AI leading to complacency and a lack of personal effort. This perspective underscored the need for a balanced approach, ensuring AI supplements rather than replaces students’ own efforts in improving their writing skills. Concerns about the authenticity of learning experiences with generative AI were evident in the findings. Student 7 feared that excessive use of AI might diminish personal learning and self-reliance. Student 8 questioned whether AI suggestions truly improve writing or just make it technically correct, stressing the need for authenticity. Student 9 was skeptical about AI's ability to understand cultural nuances in writing. In contrast, Student 12 viewed AI as a tool that enhances efficiency without replacing creativity, seeing it as beneficial for streamlining editing. Overall, while there were concerns about overreliance and authenticity, there was also recognition of AI's potential to improve productivity.
Markers expressed mixed views on integrating generative AI in education. Concerns included the risk of setting a precedent that encourages shortcuts and undermines educational integrity. Marker 1 worried about potential negative long-term effects, while Marker 3 emphasized the irreplaceable value of hard work in education. In contrast, Marker 10 saw AI as a potential motivational tool that, if used carefully, could help address plagiarism issues. This range of perspectives underscored the challenge of balancing technological advancements with traditional educational values. Markers generally agreed that generative AI can be a valuable educational tool if used thoughtfully. Marker 2 supported AI as a supplementary tool that can aid in understanding concepts. Marker 5 viewed AI as a crucial technological advancement and emphasized the need for lecturers to be knowledgeable about it. However, Marker 7 cautioned against excessive reliance on AI, stressing that it should be used critically rather than for direct copying, to avoid issues of plagiarism and ensure genuine learning. This suggested that while there is broad support for AI's potential, there are varying opinions on how it should be integrated into educational practices. Markers collectively advocated for responsible use of generative AI in education. Marker 6 suggested using AI as a brainstorming tool to support creativity rather than replace it. Marker 8 saw AI as beneficial for enhancing students’ writing but warned against misuse. Marker 9 supported integrating AI into academic writing instruction while emphasizing the importance of teaching responsible use. The discussion highlighted the need to balance encouraging AI's motivational potential with maintaining academic integrity. Lecturers’ perspectives on integrating generative AI into the WRI124 module revealed a nuanced dialogue.
While concerns about misuse and shortcuts persist, lecturers also recognized AI's potential benefits. Lecturers advocated for AI as both a challenge and an asset, emphasizing its role as a supplementary tool rather than a quick fix. This approach aligns with sociocultural theory (SCT) and Vygotsky’s Zone of Proximal Development (ZPD), stressing the importance of student responsibility and active engagement. The integration of AI was seen as a mixed-initiative decision, balancing technology with essential human effort. Lecturers called for pedagogical strategies that maintain this balance, ensuring AI enhances but does not replace independent student work. Students’ views on using generative AI in the WRI124 module reflected a range of attitudes, marked by skepticism, optimism, and concerns about authenticity. These perspectives aligned with scholarly caution about overreliance potentially compromising the authenticity of students’ work. While some students feared AI might alter their writing styles, others saw it as a valuable tool for addressing specific writing challenges, particularly in grammar and structure. This optimism supported the view of generative AI as a supplementary tool rather than a replacement.
Using sociocultural theory (SCT), which highlights peer interactions and cultural contexts, helps explain these varied perceptions. The concept of the Zone of Proximal Development (ZPD) underscores the need to avoid overdependence on AI to promote self-directed learning. The balance between leveraging AI and maintaining individual effort is key in this collaborative relationship. Markers’ discussions about integrating generative AI into academic writing revealed a nuanced perspective, reflecting both optimism and caution. The markers recognized generative AI as a valuable educational tool that can motivate students and enhance efficiency, aligning with sociocultural theory as a form of scaffolding. However, markers expressed concerns about potential misuse, such as verbatim copying, which underscores the need for students to develop critical thinking and interpretation skills. This cautious stance contrasts with the broader literature that supports AI's role in boosting creativity and learning. While markers agreed on the importance of responsible AI use, the potential risks highlight the need for effective pedagogical strategies to address these concerns and ensure that AI supports rather than undermines academic development.
The research on perceptions of generative AI in Open Distance eLearning (ODeL) reveals a spectrum of views among lecturers, students, and markers. Lecturers saw generative AI as a beneficial supplementary tool but stressed the need for responsible use to avoid shortcuts and maintain essential human effort in writing. Lecturers advocated for a balanced approach where AI aids correction and comparison without replacing independent writing. Students displayed varied attitudes shaped by peer interactions, cultural contexts, and concerns about authenticity. This highlights the necessity for tailored approaches to accommodate different learning preferences and emphasizes the importance of balancing AI's role in the writing process. Markers recognized the potential of generative AI for motivation and efficiency but also cautioned against misuse and verbatim copying. Markers aligned with the emphasis on responsible AI use and the need for thoughtful integration.
Overall, the findings underline the importance of a context-specific, critical approach to integrating generative AI in education, focusing on trust, responsible use, and ongoing evaluation to enhance academic writing skills. The study has key limitations:
- Single Module Focus: It examines only one English language module, which may limit its applicability to other subjects or module structures.
- Exploratory Nature: It does not recommend a specific generative AI system for Distance Education (DE) universities, which might affect the generalizability of the findings.
Future research could address these limitations by:
- Expanding to multiple language modules or disciplines to gain a broader understanding of generative AI's impact.
- Testing various generative AI systems to assess their effectiveness across different educational contexts.
The investigation into generative AI integration within the WRI124 module reveals a nuanced landscape:
- Lecturers: Recognized generative AI's benefits but expressed concerns about misuse. Lecturers advocated for a balanced approach that combines traditional and technological methods, emphasizing responsible AI use.
- Students: Initially skeptical but hopeful about AI support, given effective guidance. Students were open to AI's potential if it is used responsibly and enhances their writing.
- Markers: Cautious about potential shortcuts but saw generative AI as a valuable educational tool when used purposefully. Markers stressed the importance of teaching responsible use and leveraging AI to complement student creativity.
Recommendations:
- Implement balanced AI integration that complements traditional methods.
- Develop comprehensive guidelines, training programs, and user-friendly interfaces.
- Encourage active student engagement and provide clear communication about AI's capabilities and limitations.
- Explore AI's potential as a motivational tool to enhance the academic writing experience.
The study highlights the need to address concerns and optimize generative AI's role in education, contributing to discussions on academic authenticity, creativity, and future learning practices.
Mitrano, T. (2023, January 16). Coping with ChatGPT. Inside Higher Ed | Higher Education News, Events and Jobs. https://www.insidehighered.com/blogs/law-policy%E2%80%94and-it/coping-chatgpt
Professor Peter Martin established Cornell’s Legal Information Institute. He integrated new digital tools into legal research and writing, transforming assignments that once took hours into tasks completed in minutes. This shift, driven by early electronic search capabilities, marked a significant change in how legal work was approached. With a background in history and computer information science, Mitrano has taught courses related to internet law and policy, now evolving into the culture, law, and politics of information policy. Early on, she avoided multiple-choice tests to prevent academic integrity violations and instead focused on comprehensive take-home essays and class participation. In recent years, as she adapted to teaching in computer information science, she moved away from assignments prone to plagiarism.
Now, with the rise of ChatGPT and similar AI tools, there is a new challenge for academic integrity. While some institutions have banned these tools, it is impractical to fully block access, and the technology raises questions about how to maintain academic standards. As she prepared her new course syllabus, she included a section on AI to address potential integrity issues, drawing inspiration from Martin’s idea that education should focus on how to find and utilize information, rather than just memorizing it. This technological shift presents both a crisis and an opportunity. While some see it as a threat, she views it as a chance to reevaluate teaching methods and align them with current technological realities. If AI tools like ChatGPT can push us away from “teaching to the test,” AI might offer a beneficial change. Her course now includes significant discussion of misinformation, reflecting its growing importance in society’s global and political landscape. Ultimately, we as a society must approach AI with critical inquiry and ethical consideration, ensuring that educational methods foster genuine learning and innovation. This moment calls for a thoughtful integration of technology that enhances rather than hinders the educational experience.
Pazzanese, C. (2020, October 26). Ethical concerns mount as AI takes bigger decision-making role. Harvard Gazette. https://news.harvard.edu/gazette/story/2020/10/ethical-concerns--as-ai-takes-bigger-decision-making-role/
Pazzanese asserts that AI has been the focus of STEM research for decades. Most people became aware of AI through Google and Facebook. Today AI is essential in many industries, including health care, banking, retail, and manufacturing. It is said to improve efficiency, bring down costs, and accelerate research and development. However, this growth has also raised fears that AI may do more harm than good. There is no U.S. government oversight. “Private companies use AI software to make determinations about health and medicine, employment, creditworthiness, and even criminal justice without having to answer for how AI is ensuring that programs aren’t encoded, consciously or unconsciously, with structural biases” (Pazzanese, 2020, para. 2). Businesses worldwide are expected to spend $110 billion on AI in 2024. Retail and banking spent the most at $5 billion each. IDC expects media and federal and central governments to invest the most between 2018 and 2023. Joseph Fuller, professor of management practice at Harvard Business School, says that most major businesses use AI and count it as essential to their operations. Most people assumed that AI would handle repetitive tasks and low-level decision making. Instead, AI has grown massively in sophistication; machine learning can now sort and analyze massive amounts of data in seconds. This has transformed many fields, including education. AI is now used in strategic decision making and product development. In health care, AI is currently used for billing and paperwork, but in the future it could be used for data analysis, imaging, and diagnosis.
In employment, AI is being used to process resumes and analyze job candidates’ faces and voices during interviews. AI takes on technical tasks like routing for package delivery trucks, which allows workers to focus on other tasks and get more done during the day. Fuller (2020) believes that using AI to fully eliminate job categories is unlikely, since humans are still needed to supplement AI. Karen Mills, who ran the U.S. Small Business Administration from 2009 to 2013, says that small businesses can use AI to gain new insights into sales trends, cash flow, ordering, and other important financial information. It is hard for small businesses to get loans because banks struggle to get an accurate picture of a small business; AI can pull together large amounts of information about a business in minutes, eliminating long delays in loan applications. While AI was expected to remove human bias, it has been shown to reproduce the biases that already exist. Ethical concerns center on privacy and surveillance, bias and discrimination, and the question of human judgment; the author wonders whether computers will one day outthink us. The article suggests there is no need to panic about AI bias, because human bias already shapes everyday work situations; AI that is calibrated properly and deployed thoughtfully can take more factors into account than before. Political philosopher Michael Sandel thinks that when AI reproduces biases, it can make those biases seem scientific and give them more authority. Mills (2020) thinks that redlining could happen again when AI goes through banking data sets. Banks are highly regulated, so they are focusing on their AI systems to make sure that this does not happen. Some think that AI should be highly regulated, but no one knows what that would look like. So far, companies that use AI police themselves.
Businesses are concerned about AI and what it means for their operations. They already think about potential liability, but they cannot anticipate every scenario. Few think the government will ever be up to the job: the rapid rate of technological change means that legislators cannot keep up, and prescreening every AI-based product for social harm would create a huge drag on innovation. Jason Furman, a professor of the practice of economic policy at Harvard Kennedy School, believes that the government could at least develop a better understanding of AI. Rather than a single AI watchdog agency, government could use existing bodies, such as the National Highway Traffic Safety Administration for AI in autonomous vehicles. Furman thinks industry-specific panels would be more knowledgeable about the overarching technologies of which AI will be only one piece. The European Union already has rigorous data privacy laws and is considering regulations for AI, while the U.S. is usually behind on these types of technology; Furman thinks there needs to be more urgency in the U.S. government. He believes that the big tech companies are neither self-regulating nor subject to adequate government regulation, and that the free market will not sort itself out. Sandel thinks that companies need to consider the ethical concerns and that the public needs to be better informed about the ethics of new technologies.
Banihashem, S. K., Taghizadeh Kerman, N., Noroozi, O., Moon, J., & Drachsler, H. (2024). Feedback sources in essay writing: Peer-generated or AI-generated feedback? International Journal of Educational Technology in Higher Education, 21. https://doi-org.corvette.salemstate.edu/10.1186/s41239-024-00455-4
Feedback is widely recognized as a crucial tool for improving learning. It is defined as information given by various sources (e.g., teachers, peers, self, AI, technology) about one’s performance or understanding. Feedback helps students become more aware of their strengths and areas for improvement, offering actionable steps to enhance their performance. Research highlights the positive effects of feedback on learning, including increased motivation, greater engagement, enhanced self-regulation and metacognitive skills, and improved learning outcomes. Traditionally, teachers have been the primary source of feedback, leveraging their expertise to provide insights into students’ performance and understanding. However, the role of teachers in delivering feedback has been challenged by increasing class sizes and the rise of digital technologies. These factors have led to higher workloads for teachers, limiting their ability to offer personalized and timely feedback to each student. To address the challenge of providing personalized feedback in large classes, peer feedback has emerged as a viable solution. In this approach, students provide feedback to their peers rather than relying solely on teachers. Research shows that peer feedback can enhance students’ learning by encouraging deeper engagement, self-regulation, and motivation. Additionally, it can help reduce teachers’ workload by shifting some feedback responsibilities to students, creating a more dynamic and interactive learning environment. While peer feedback offers many benefits, providing high-quality feedback can be challenging. Key issues include:
- Understanding Feedback Principles: Peers often lack a solid grasp of feedback principles necessary for effective evaluations.
- Complexity of Feedback: Giving high-quality feedback requires significant cognitive effort to assess assignments, identify issues, and suggest improvements.
- Domain-Specific Expertise: Effective feedback often needs specialized knowledge, which students may not always have.
These factors make it difficult for students to consistently provide valuable feedback to their peers. Recent advancements in technology and the rise of fields like Learning Analytics (LA) have created new opportunities for improving feedback practices with scalable, timely, and personalized responses. A notable development is the introduction of ChatGPT, an AI tool that has generated significant discussion about its potential impact on education and how it could enhance feedback processes. AI-powered ChatGPT introduces the concept of AI-generated feedback, which has potential benefits for feedback practices. However, the current literature on ChatGPT’s effectiveness in this area is limited and largely non-empirical. As a result, there is a restricted understanding of how well ChatGPT can support feedback practices, including its impact on timeliness, effectiveness, and personalization of feedback. The potential of AI-generated feedback, particularly from ChatGPT, to provide quality feedback remains uncertain. There is a lack of research on how well AI tools like ChatGPT can enhance feedback quality compared to traditional peer feedback.
This research aims to investigate and compare the quality of feedback generated by ChatGPT with that provided by students in the context of essay writing. This study aims to significantly contribute to the literature on AI in education by comparing the quality of AI-generated feedback from ChatGPT with peer-generated feedback. It will highlight ChatGPT’s potential as an effective automated feedback tool and provide insights into how AI can help reduce the feedback workload for teachers. The study focuses on essay writing due to its prevalence and complexity. Essay writing is a common yet challenging task for students, and existing literature shows that students frequently struggle to meet high standards in this area. Thus, examining feedback quality in this context is particularly relevant. Teachers often feel dissatisfied with the depth and quality of students’ essay writing. Teachers frequently find that their feedback is superficial due to the significant time and effort required for thorough assessment and individualized feedback. This limitation hampers their ability to provide in-depth evaluations. Comparing peer-generated feedback and AI-generated feedback in essay writing is valuable for both research and practical application. This study provides insights into the effectiveness of feedback from peers and AI, crucial for improving essay writing quality. It has the potential to alleviate teachers’ workload by reducing the time and effort required for essay evaluation and could enhance the quality of essays through better feedback mechanisms.
The key questions for this study are: “To what extent does the quality of peer-generated and ChatGPT-generated feedback differ in the context of essay writing?” and “Does a relationship exist between the quality of essay writing performance and the quality of feedback generated by peers and ChatGPT?” (Seyyed et al., 2024). The study took place during the 2022–2023 academic year at a Dutch university specializing in life sciences, involving 74 graduate students from food sciences. Of these participants, 77% were female (57 students) and 23% were male (17 students). This exploratory study, conducted in two phases, aimed to enhance students’ essay writing skills through a peer feedback process within the “Argumentative Essay Writing” (AEW) module on the Brightspace platform. Phase One (week one): Students wrote essays on controversial topics related to their field of study, such as food safety and risk assessment. Students had one week to complete and submit their essays. Phase Two (week two): Students were randomly assigned to provide asynchronous feedback on two peers’ essays using the FeedbackFruits app integrated into Brightspace. The feedback focused on evaluating the argumentation elements of the essays, with specific guidelines and word count requirements provided. The study used FeedbackFruits’ peer feedback feature to facilitate this process, aiming to boost student engagement and collaborative learning.
In addition to peer feedback, the study also incorporated ChatGPT to provide feedback on students’ essays. ChatGPT was given a feedback prompt similar to the students’, with a slight modification, to ensure consistency. This prompt asked ChatGPT to assess the essays based on the presentation and justification of argumentative elements, identifying problems and suggesting improvements, with feedback between 250 and 350 words. The study collected data from students’ essays, peer feedback, and ChatGPT-generated feedback. To analyze the quality of the feedback and essays, two coding schemes were applied. In this study, essay quality was assessed using a coding scheme developed by Noroozi et al. (2016), which includes eight key elements of high-quality essay composition. These elements are: introduction, clear stance, supporting arguments, justifications for arguments, counter-arguments, justifications for counter-arguments, responses to counter-arguments, and conclusions with implications. Each element is scored from zero to three, with cumulative scores used to determine overall essay quality. Two experienced coders evaluated the essays using this scheme, achieving a high agreement level of 75% (Cohen’s Kappa = 0.75, p < 0.001), indicating significant consensus between them. The feedback-quality coding scheme includes three components:
- Affective Component: Measures the emotional tone of the feedback, including positive sentiments like praise and negative emotions like disappointment.
- Cognitive Component: Involves description (summary of the essay), identification (specific issues), and justification (explanations for identified issues).
- Constructive Component: Entails providing recommendations for improvement, though not detailed action plans.
Feedback quality is rated from zero (poor) to two (good). The average score of feedback from peers and ChatGPT was calculated for each essay to determine overall feedback quality. The assessment, conducted by two evaluators, showed high inter-rater reliability at 75% (Cohen’s Kappa = 0.75, p < 0.001), indicating significant agreement between them. To ensure the validity and reliability of the data for statistical analysis, the study implemented two tests: Levene’s test to assess homogeneity of variance across groups, and the Kolmogorov-Smirnov test to evaluate normality. Both tests confirmed that the data met the necessary assumptions. For the first research question, gender was used as a control variable, and MANCOVA was applied to compare feedback quality between peer feedback and ChatGPT-generated feedback. For the second research question, Spearman’s correlation was used to examine relationships among original essays, peer feedback, and ChatGPT-generated feedback. The results indicated a significant difference in feedback quality between peer feedback and ChatGPT-generated feedback. Peer feedback was of higher quality than ChatGPT’s feedback, a difference driven primarily by the descriptive and problem-identification components of the feedback.
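For readers who want to see what these checks look like in practice, the sketch below reruns the same kinds of statistics (Cohen’s kappa for the two coders, Levene’s and Kolmogorov-Smirnov tests for the assumption checks, and Spearman’s correlation) on invented numbers. This is not the authors’ analysis code; the MANCOVA step is omitted, and every value is fabricated purely for illustration.

```python
# Illustrative sketch only: the statistics reported in the study, run on made-up data.
# Assumes numpy, scipy, and scikit-learn are installed.
import numpy as np
from scipy.stats import levene, kstest, spearmanr, zscore
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(0)

# Two coders rating the same 20 feedback comments on a 0-2 quality scale.
coder_a = rng.integers(0, 3, size=20)
coder_b = np.where(rng.random(20) < 0.75, coder_a, rng.integers(0, 3, size=20))
print("Cohen's kappa:", cohen_kappa_score(coder_a, coder_b))

# Assumption checks analogous to the study's Levene and Kolmogorov-Smirnov tests.
peer_quality = rng.normal(1.4, 0.3, size=74)   # hypothetical peer feedback scores
gpt_quality = rng.normal(1.1, 0.3, size=74)    # hypothetical ChatGPT feedback scores
print("Levene:", levene(peer_quality, gpt_quality))
print("K-S (normality of peer scores):", kstest(zscore(peer_quality), "norm"))

# Spearman correlation between essay quality and feedback quality.
essay_quality = rng.normal(12, 3, size=74)     # hypothetical essay scores
print("Spearman:", spearmanr(essay_quality, gpt_quality))
```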
While ChatGPT provided more extensive descriptive feedback, including summaries and general descriptions, students excelled in pinpointing and identifying specific problems in their feedback. Overall, there was no significant relationship between the quality of essay writing and the feedback from peers or ChatGPT. However, a positive correlation was found between essay quality and the affective component of ChatGPT-generated feedback, meaning that better essays received more positive emotional feedback from ChatGPT. Conversely, a negative relationship was observed between essay quality and the affective component of peer feedback, indicating that higher-quality essays received less emotionally charged feedback from peers. The study explored ChatGPT’s potential as a feedback tool for essay writing and compared its feedback quality with that of peer feedback. The findings showed that peer feedback was of higher quality than ChatGPT-generated feedback. This difference was mainly due to variations in the descriptive and problem-identification aspects of the feedback. ChatGPT’s feedback was often more descriptive, focusing on summarizing the essay content, which reflects its ability to analyze and synthesize textual information. ChatGPT excels in providing extensive descriptive feedback, which is useful for summarizing complex arguments and offering comprehensive overviews of essay structure and coherence. However, students’ feedback is superior in identifying specific issues and areas for improvement. This advantage is likely due to students’ cognitive skills, critical thinking, and contextual understanding, which enable them to spot problems that ChatGPT might miss. The findings reveal that ChatGPT-generated feedback includes all essential components of high-quality feedback (affective, cognitive, and constructive). This suggests that ChatGPT could be a viable feedback source, supported by previous research highlighting the positive impact of AI-generated and automated feedback on educational outcomes.
The results suggest that ChatGPT and peer feedback could complement each other in the feedback process. Combining both sources may enhance overall feedback quality, creating a synergistic effect that improves feedback outcomes. The second research question revealed no significant correlation between essay quality and the quality of feedback from either peers or ChatGPT. This finding suggests that the quality of an essay does not necessarily impact the effectiveness of the feedback provided. This challenges the typical view that higher-quality essays lead to better feedback due to clearer identification of issues or more nuanced evaluation. The study also found that while essay quality had no overall effect on feedback quality, ChatGPT’s feedback showed a positive correlation with essay quality in terms of affective features. ChatGPT provided more positive feedback as essay quality improved. Conversely, peer feedback showed a negative relationship with essay quality in affective terms, possibly reflecting a shift in feedback focus from emotional support to more cognitive and constructive critique as students improve. This indicates that while the general relationship between essay and feedback quality is complex, the affective aspects of feedback may still vary with essay quality, particularly in AI-generated feedback.
The study had several limitations and areas for future research: The study was conducted at a single institution with a small participant pool, limiting the generalizability of the findings. Future research should involve a larger and more diverse participant pool from various institutions and courses. The study did not explore how students used the feedback from peers and ChatGPT in their revision process. Future research should investigate how feedback is incorporated into essay revisions to understand its practical impact. ChatGPT’s responses may vary based on how prompts are phrased. Future studies should carefully control and consider prompt-related factors to ensure comparability of AI-generated feedback with peer feedback. Although no significant inaccuracies were found in ChatGPT’s feedback, future research should address the potential for inaccuracies and explore ways to validate AI-generated feedback. The inter-rater reliability was 75%, which, while statistically significant, suggests a need for improved precision. Additional coder training could enhance reliability. The study highlights the potential of AI, like ChatGPT, to redefine feedback mechanisms in education. Future research could explore how adaptive learning systems using AI can provide tailored educational experiences and support student engagement. To ensure the reliability and validity of AI feedback, future studies should include prompt validation, blind feedback assessments, and comparative analyses with other AI models. Overall, the findings suggest integrating ChatGPT with peer feedback in educational settings to leverage their complementary strengths and reduce teacher workload, particularly in large online courses. This study addresses a gap in the literature regarding AI-generated feedback for complex tasks like essay writing in higher education. By comparing ChatGPT-generated feedback with peer feedback, the study establishes a foundation for further research on AI’s effectiveness in educational feedback. The study provides insights into how ChatGPT could be integrated into feedback practices in higher education. It suggests that ChatGPT’s feedback could complement peer feedback, offering a valuable tool for enhancing feedback in large courses with substantial essay-writing components. This integration could help teachers manage feedback for a larger number of students effectively.
Svanberg, M., Li, W., Fleming, M., Goehring, B., & Thompson, N. (2024). Beyond AI exposure: Which tasks are cost-effective to automate with computer vision? SSRN Electronic Journal. https://doi.org/10.2139/ssrn.4700751
People tend to fear that machines will steal jobs during periods of major technological change. This fear has re-emerged with Large Language Models (e.g., ChatGPT, Bard, GPT-4) because LLMs can do things that were previously only done by humans. A recent study said that LLMs could automate about 50% of tasks, which would create a huge disruption to the labor force if it happened rapidly; if it happened slowly, labor could adapt as it has during other economic transformations. Good policy and business decisions depend on how rapidly this automation will happen. AI is already changing labor demand. Much of the anxiety comes from measures of “AI exposure,” which classify the tasks or abilities that could, in principle, be automated. It is important to note that the authors of these exposure studies are vague on the timeline and extent of automation because they do not take into consideration the technical feasibility or economic viability of AI automation. The only exception is a McKinsey report, which estimates AI adoption somewhere between 4% and 55%. It is hard to reach a conclusion from such imprecise predictions. AI exposure models also do not distinguish between full and partial automation, and separating the two is essential for a complete picture.
In this paper, the authors address three shortcomings of AI exposure models and offer a more economically grounded prediction of task automation. First, they survey workers who are familiar with the end-use tasks. Second, they model the cost of building the AI systems needed to reach the required level of performance, which is essential because building these systems is expensive. Third, they assess whether AI adoption is economically attractive. The analysis is task-based: first, is it possible to build an AI system to complete the task? Second, would it be more economically attractive to use AI than human workers? The authors use the existing literature to address exposure, and they compare the attractiveness of AI versus humans by looking at the cost of each. They draw on the computer science literature on the relative costs of training and inference for deep learning, then use 35 case studies to estimate the cost of AI. The concept of minimum viable scale is central to comparing the cost of AI with the cost of human labor: it is the point at which the AI deployment’s fixed costs are sufficiently amortized that the average cost of using the computer vision system equals the cost of human labor of equivalent capability. AI automation is cost-effective only when the deployment scale is larger than the minimum viable scale.
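As a rough illustration of the break-even logic behind minimum viable scale (this is not the authors’ model, and all of the numbers are invented), the short sketch below computes the deployment scale at which an AI system’s amortized cost matches the cost of equivalent human labor:

```python
# Minimal sketch of the "minimum viable scale" idea: the deployment scale at which
# amortized AI cost per unit of work equals the cost of equivalent human labor.
# Function name and all figures are hypothetical, for illustration only.

def minimum_viable_scale(fixed_cost, ai_cost_per_unit, human_cost_per_unit):
    """Smallest number of task-units per year at which the AI system breaks even."""
    if ai_cost_per_unit >= human_cost_per_unit:
        return float("inf")  # AI is never cheaper per unit, so no viable scale exists
    return fixed_cost / (human_cost_per_unit - ai_cost_per_unit)

# Hypothetical example: $500k to build and maintain a vision system,
# $0.02 per inspection to run it, versus $1.50 of human labor per inspection.
scale = minimum_viable_scale(500_000, 0.02, 1.50)
print(f"Break-even at about {scale:,.0f} inspections per year")
# A small firm doing 100k inspections per year sits below this scale,
# so automating the task alone would not be cost-effective for it.
```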
There are many tasks that a sufficiently sophisticated computer vision system could carry out, while other tasks offer such systems little value. When estimating the cost of AI, the authors look for AI with capabilities equivalent to human labor, what Brynjolfsson calls “Turing Trap” automation. The authors map only some capabilities, so there are likely other advantages and disadvantages of using AI; they assume these effects are secondary in the short to medium term. Given that their results remain consistent despite substantial changes in the assumed benefits of the AI systems, these secondary effects would need to be considerable to significantly alter their conclusions. The authors assume throughout that AI will be as proficient as human workers. But why wouldn’t businesses use AI that is only partially proficient? That case is not studied here because this paper focuses on human replacement; less capable systems may supplement human workers, a possibility the authors take up in other work. Nor does the paper focus on AI that is more capable than human workers, because AI that merely matches human proficiency will become economically viable first. To estimate the cost of a computer vision system, the authors use the methods developed by Thompson et al. (2021, 2022, 2024) to analyze and calculate the individual cost components. The cost for a business to deploy an AI system is broken down into fixed costs, performance-dependent costs, and scale-dependent costs. Fixed costs, also called engineering costs, comprise implementation and maintenance costs. Performance-dependent costs fluctuate according to system requirements and include the cost of data and the compute cost per training round. Scale-dependent costs comprise the running cost of the system.
To estimate the total cost of replacing human labor for a specific task, the authors calculate the net present value of the system’s cost over its projected lifespan. Svanberg et al. (2024) focus on determining which tasks are cost-effective to automate with computer vision by gathering input from domain experts rather than AI specialists. The authors use an online survey to collect data on task performance requirements and associated costs, recruiting respondents from Prolific and administering the survey through Qualtrics. Participants choose tasks they are familiar with and provide accuracy and cost estimates, with invalid or unsure responses excluded. The average number of valid responses per task is nine, and for 33 tasks lacking sufficient expert responses, the authors substitute the average value from the other tasks. Efforts to collect data on application complexity (entropy) from domain experts through surveys were unsuccessful, so the complexity of each task is assessed manually, and the study tests the robustness of its findings by varying the entropy values. The cost estimates assume significant engineering costs, though that assumption is not always accurate. To a large extent, the cost of human labor aligns with the marginal cost of compensation per worker.
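The sketch below illustrates the kind of cost accounting described here, summing hypothetical fixed, performance-dependent, and scale-dependent costs and discounting the recurring portion over a projected lifespan. The categories follow the summary above, but the function, parameter names, and figures are my own invention rather than the paper’s actual model.

```python
# Hedged sketch of the cost accounting summarized above: fixed (engineering),
# performance-dependent (data, training compute), and scale-dependent (running)
# costs, with recurring costs discounted over a projected lifespan.
# All numbers and the discount rate are made up for illustration.

def npv_of_ai_system(fixed, data_cost, training_cost,
                     running_cost_per_year, lifespan_years, discount_rate=0.05):
    """Net present value of the total system cost over its lifespan."""
    upfront = fixed + data_cost + training_cost
    recurring = sum(running_cost_per_year / (1 + discount_rate) ** t
                    for t in range(1, lifespan_years + 1))
    return upfront + recurring

total_cost = npv_of_ai_system(fixed=300_000, data_cost=80_000,
                              training_cost=40_000,
                              running_cost_per_year=25_000,
                              lifespan_years=5)
print(f"NPV of deployment cost: ${total_cost:,.0f}")
# Compare this against the discounted cost of the human labor the system would
# replace to judge whether automation is economically attractive.
```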
There is a significant gap between tasks exposed to AI and those considered economically viable for automation. Although 36% of jobs in U.S. non-farm businesses involve tasks that could use computer vision, only 8% of these jobs (23% of the exposed tasks) feature tasks that are economically attractive for firms to automate. The authors find that vision tasks account for 1.6% of U.S. non-farm compensation, but only 0.4% of compensation (about 23% of the exposed share) is attractive to automate with AI. The results are heavily influenced by the cost of deploying AI systems: tasks with low wages, or occupations with many different tasks, may not be economically attractive to replace even with systems costing $1,000, while high-wage tasks or those concentrated in fewer job roles may justify investments in systems costing over $100 million. The paper’s figures highlight that system costs significantly affect economic attractiveness and that substantial cost reductions are necessary before many more tasks become suitable for automation. The authors also test the sensitivity of the model to changes in cost parameters, to changes in the benefits provided by switching to AI systems, and to a “bare bones” scenario in which many costs are much lower. Even the more consequential changes, to required accuracy, data costs, or engineering costs, increased the share of attractive automation from 23% to at most 33% of exposed compensation. The assumption is that AI of similar capability will produce similar benefits; if the volume of work fluctuates, human workers may be better suited to handle other tasks during periods of low activity.
To examine how sensitive the automation decision is to the value generated by an AI system, the authors analyze how the adoption decision would shift if the system, for instance, produced twice the value of a human worker. Empirically, they base their estimates on the common economic assumption that workers are compensated according to their marginal productivity. The study looks only at short- to medium-term decisions; long-term results were not studied. The authors also test a “bare bones” implementation, meaning free data, free compute, and minimal engineering effort. Even under these extremely aggressive assumptions, the amount of economically attractive firm-level automation increases only to 49% of exposed vision compensation (0.79% of all compensation). This highlights the highly fragmented distribution of tasks in the economy, where even moderate development costs can be prohibitive, and suggests that automating many of these tasks would probably require a significant sales and coordination effort, potentially slowing these initiatives. Frank et al. (2023) studied unemployment risk by calculating the probability of receiving unemployment benefits, using data from each state’s unemployment benefits office. The authors recreate Frank et al.’s (2023) analysis, aggregating their measures to the share of tasks in each area that are computer vision tasks (their AI exposure variable) and a composite score of how close those tasks are to having an economic advantage. They restate the findings of Frank et al. (2023) and compare them to their own predictions: their method explains 10.9% of the variance, compared with less than 3% for most other measures and 8.9% to 10.7% for the three Webb measures and Arntz et al. (2017). AI could become more economically attractive in two ways: AI could automate more labor per system (greater deployment scale), or AI systems could become less expensive to build (lower development cost). Besides firms custom-building their own systems, another approach to reaching the minimum viable scale is pooling human labor across multiple firms. Although this could theoretically occur if one firm gains market share through improved efficiency, vision tasks represent only a small portion of overall firm costs, so gaining an advantage in vision tasks is unlikely to create significant competitive differences at the firm level in many sectors.
The authors believe that demand for AI solutions is more likely to be met through AI-as-a-service business models. Even for firms with more than 5,000 employees, larger than 99.9% of U.S. firms, it is only cost-effective to automate less than one-tenth of their existing vision labor at the current cost structure. McElheran et al. (2023) find that fewer than 6% of firms use AI-related technologies, and these are disproportionately large firms representing 18% of employment. Most firms are, and are likely to remain, too small to cost-effectively develop computer vision systems to replace their existing workers. However, if labor costs for a specific task can be pooled across multiple firms, the economics of automation become much more appealing: if AI systems could be deployed at national scale, handling all instances of a task across the entire economy, AI would already have an economic advantage for 88% of vision task compensation. There is still the potential for AI-as-a-service to drive much greater automation, but this is unlikely to happen in the short to medium run.
The technical challenge is determining whether systems designed for specific tasks can be generalized to the industry level; as AI systems generalize across more tasks, their costs have risen rapidly. The economic challenge of AI-as-a-service is that getting many separate firms onto a single platform is expensive, so only partial adoption of platforms is expected. HubSpot Sales blogger Fuchs (2022) claims that closing rates are as low as 20% in enterprise IT sales, and the authors find it unlikely that a third-party vendor would be able to capture more than a fraction of the market. Another obstacle to the proliferation of AI-as-a-service is access to data: AI systems need fine-tuning to incorporate knowledge about the objects and situations they must manage effectively, and firms do not always release data outside the company, which is a barrier to third-party AI-as-a-service vendors. Governments and regulatory bodies could also make rules that either accelerate or hinder data sharing. If platformization across all vision tasks were to accelerate rapidly (+20% per year), significant automation could occur within the next decade; a more gradual rate of platformization (+5% per year) would take decades to realize the full potential for automation. The authors expect AI systems to become less expensive through technological change rather than through platformization. As long as AI requires customization for specific applications (e.g., through fine-tuning), the associated costs will limit its proliferation. Currently, computer vision offers an economic advantage in only 23% of vision tasks at the firm level, and there are obstacles to AI-as-a-service deployments, so a significant reduction in costs will likely be necessary for computer vision to replace human labor at scale.
Even with a 50% annual reduction in costs, it will not be until 2026 that half of the vision tasks have a machine economic advantage, and by 2042 there will still be tasks where human labor remains more advantageous despite the availability of computer vision. With a 10% annual decrease in system costs, computer vision’s penetration will remain below half of the exposed task compensation by 2042. Citing Moore’s law, Ford (2015) suggests that costs will decrease rapidly. The annual cost reduction in GPU computing was measured by Thompson et al. (2020) and Erdil and Besiroglu (2023); this study uses the latest estimate of a 22% annual cost decrease, provided by Hobbhahn and Besiroglu (2022). The costs of data and engineering will likely decrease as well, though not as predictably: data will become cheaper due to increasing digitization, and engineering costs might fall through better developer tools and the spread of machine-learning engineering expertise.
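The timeline arithmetic above follows from simple compounding: a system costing C today costs roughly C(1 − r)^t after t years of annual reduction r. The sketch below, with an invented starting cost and break-even threshold, shows why a 50% annual decline crosses a threshold within a few years while a 10% decline takes decades.

```python
# Rough sketch of the cost-decline arithmetic discussed above.
# Starting cost and break-even threshold are illustrative only.

def years_until_affordable(current_cost, target_cost, annual_reduction):
    """Years until current_cost * (1 - r)^t falls to or below target_cost."""
    years = 0
    cost = current_cost
    while cost > target_cost:
        cost *= (1 - annual_reduction)
        years += 1
    return years

# E.g. a $10M system versus a $1M break-even point:
print(years_until_affordable(10_000_000, 1_000_000, 0.50))  # 4 years at 50%/yr
print(years_until_affordable(10_000_000, 1_000_000, 0.10))  # 22 years at 10%/yr
```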
The need for fine-tuning might decrease as foundation models improve, and a paradigm shift in how AI developers fine-tune models would also decrease costs. As individual cost components improve rapidly, they become less significant to the overall pace of advancement. Worker displacement, retraining, social support, and other interventions are important to AI policy discussions. Acemoglu et al. (2022) identify correlations between AI hiring plans and hiring trends in other areas. They document a sharp increase in AI-related hiring starting in 2010, with a significant acceleration around 2015-16, and find that establishments exposed to AI reduce their hiring in both non-AI roles and overall at the establishment level. Grennan and Michaely (2020) show that security analysts with high exposure to AI are more likely to exit the profession, and those who leave tend to move into non-research roles that require management and social skills.
Data from the U.S. Census Bureau (2021) indicate that, on average, 11% of jobs in private sector establishments were lost annually between 2017 and 2019; despite this, significant job creation led to a net increase of approximately 1.6% over the same period. In the labor market, 23% of vision task compensation has recently become attractive to automate, so automation attempts should be scaling up. In the years following this initial wave of automation, however, the rate of incremental automation drops significantly below the existing job destruction rate. If the authors assume a 50% annual decrease in computer vision costs and that every vision task where machines achieve a firm-level economic advantage is automated in the same year, the share of vision task compensation lost each year peaks at 6-8%. Task automation driven by the platformization of AI will only briefly exceed the overall job destruction rate, if it does at all. AI automation is therefore expected to be smaller than the job automation and destruction effects already observed in the economy.
It is unclear whether adding AI automation will substantially increase overall job destruction. Some increase in automation is anticipated, but it is also likely that a significant portion of AI task automation will occur in areas where traditional automation is already taking place; the two types of automation will partially substitute for each other, leading to a net effect that is less than their combined total. Bommasani et al. (2021) define foundation models as deep learning models that are “trained on broad data at scale and are adaptable to a wide range of downstream tasks.” The existence of foundation models does not undermine the approach used in this paper; rather, the economic model for predicting computer vision costs assumes the existence of a foundation model for fine-tuning (Thompson et al. 2024). The cost estimates, though sometimes significant, arise from adapting pre-existing models to specific tasks. Foundation models could affect the results if improvements in these models allow tasks currently performed by workers to be increasingly replaced without the need for fine-tuning, which would lower implementation costs and make automation more economically attractive. Better foundation models might also reduce the amount of fine-tuning data required, leading to a smaller yet still significant cost reduction. Thus, advancements in foundation models play a role in decreasing overall costs.
Limited data availability and slowing progress in foundation models make it unlikely that foundation models will entirely displace specialized models. Since many human vision tasks are not monitored by cameras, and data from tasks that are tracked by cameras might be sensitive or proprietary, much of this data will likely not be shared with the creators of foundation models. This limited data availability will hinder the ability of computer vision systems to generalize effectively. Thompson et al. (2020) find that costs escalate rapidly into the billions of dollars even to improve vision models by small increments, and Lohn (2023) shows that there is already evidence of such slowing. There are a few notable differences between their computer vision setting and the generative language setting. Language data for training foundation models is far more abundant than image data, and language models are also better at utilizing unlabeled data (LeCun and Misra 2021). Nevertheless, even with this advantage, challenges remain due to the need for specialized knowledge, meaning fine-tuning will still be necessary for tasks such as understanding specific product information. An important area for future research will be quantifying the proportion of language tasks that can be easily automated without requiring fine-tuning. There is also a difference in the cost of data: firms frequently possess large amounts of text data, which is often easier to collect than photos or videos.
This paper focuses on task automation, which provides only a partial perspective on AI adoption, since AI can also be used to augment human labor or create entirely new products. According to a survey, 83% of executives believe AI will enhance human labor rather than replace it (IBM Institute for Business Value 2023, p. 8). Bessen et al. (2018) found that only 50% of AI startups help customers reduce labor costs, while 98% focus on developing products that enhance capabilities. Although this article does not address augmentation, the authors are exploring it in other work. New tasks created as a consequence of the roll-out of AI are also not considered. The analysis relies heavily on the work of Thompson et al. (2024), which predicts the cost of developing a computer vision system ahead of time, and this carries two important limitations. The first is that the model’s input data includes no data points with accuracies exceeding 95%, which means that for approximately 40% of vision tasks requiring higher accuracy, the authors are extrapolating from their cost function. Since these extrapolations follow power laws, they are highly sensitive to variations in the estimated coefficient, underscoring the importance of future research on scaling laws, which could substantially affect their estimates and make them more precise. The second limitation is that the analysis relies on a limited number of models for transfer learning. Using a larger foundation model with higher initial accuracy could potentially conserve resources; although one might expect a model pre-trained closer to the target domain to perform better, their findings surprisingly show only a limited impact of data distance. Respondents with task expertise are identified through an online survey, which collects information about the required accuracy and the cost per data point for each task. These respondents lack AI expertise, which limits how precisely they can provide the information needed for the analysis: they might have only a general sense of how frequently mistakes are permissible when performing the task, and they could inadvertently include errors from other tasks conducted alongside the defined vision tasks when reporting error rates.
To mitigate these issues, the authors designed the survey questions to be straightforward and then inferred the information needed for their research. This approach is a compromise compared to the ideal scenario of finding experts knowledgeable in both AI and the specifics of each of the 400+ tasks. Using O*NET data is another limitation. O*NET was created as a comprehensive dictionary of occupations in the United States to support various purposes related to understanding work, and there are notable mismatches between O*NET and the authors’ method; data tailored specifically to their needs would look significantly different. One key mismatch is that O*NET-Task and DWA combinations, which the authors refer to simply as tasks, may not be ideal units for automation: tasks may not always be separable or fully automatable by computer vision, and it is uncertain whether tasks are sufficiently similar across different firms and NAICS codes to be standardized and automated on a large scale. Conversely, O*NET’s task categories might be too narrow, and tasks from different occupations could potentially be automated by a single system. Even if O*NET tasks were perfect units of automation, assigning value to automated tasks is difficult: determining whether an employee’s compensation reflects the time spent on a task, the skill required, or the effort involved is challenging, especially at the scale of all occupations and tasks in the U.S. economy, and the relationship between the value of a worker’s skills and the tasks the worker performs is not clearly defined (Autor and Handel 2013). The authors rely on O*NET task-importance ratings for this purpose.
Additionally, even when tasks are separable, the relationship between automation and worker compensation is complex and non-linear; the impact of automation on skill demand and wages varies depending on the type of automation, a complexity that their framework addresses only in an abstract manner (Combemale et al. 2022). O*NET-Task and DWA combinations were chosen manually, so possible vision tasks may have been eliminated when filtering the DWAs. However, the authors believe they have a low number of false positives because they validated the list of 414 vision tasks for feasibility. The model assumes complete substitutability of labor with capital, which, while reasonable in many cases, may not apply universally. The analysis identifies tasks where computer vision could potentially replace human labor but cannot determine whether the technology will automate or augment work: some applications, such as automating sorting processes or quality checks at the end of an assembly line, are likely to reduce labor needs directly, while others may enhance human productivity in different areas. Differences in tax rates were not considered; according to Acemoglu et al. (2020), current tax rates favor automation, but even with a 25-30% tax on labor the impact on the overall results would be relatively minor. If a system can only be used at the firm level, 77% of vision tasks are not economical to automate. Automation costs will change over time, creating the potential for more automation. The results suggest a significantly different trajectory for AI automation than the previous literature, one that aligns more closely with traditional job turnover and is more receptive to conventional policy interventions. In this scenario, the cost-effectiveness of systems plays a crucial role in determining their adoption.
Tu, Y.-F., & Hwang, G.-J. (2023). University students’ conceptions of ChatGPT-supported learning: A drawing and epistemic network analysis. Interactive Learning Environments, 1–25. https://doi.org/10.1080/10494820.2023.2286370
ChatGPT has significantly impacted the field of education by fostering a personalized and interactive learning environment. Numerous universities are exploring how ChatGPT could transform educational models in higher education. Research indicates that its features, including controllability, encouragement, and the integration of diverse types of knowledge, offer notable advantages for classroom use. Yan’s (2023) study showed that AI could help students learn how to write in a second language. In higher education, learner-centered education highlights the importance of learners’ perceptions in influencing their motivation, learning strategies, attitudes, and overall effectiveness. When designing learning content with ChatGPT, it is essential to consider these factors. Research has examined the relationship between teachers and AI, focusing on how each can complement the other in education. The study highlighted the importance of teachers’ pedagogical expertise when integrating AI tools into the learning process. Firat (2023) identified five ways in which ChatGPT-supported learning enhances self-directed learning experiences: personalized support, immediate feedback and guidance, improved accessibility to learning resources, flexible and convenient learning options, and better utilization of open educational resources.
Researchers have identified a positive relationship between learners’ attitudes, motivation, and performance. Other studies suggest that learning attitudes can influence both motivation and performance. While much of the research has focused on teachers and scientists, understanding the perspectives of students, particularly university students, is crucial. Strzelecki’s (2023) research explored students’ behavior, intentions, and use patterns, employing a drawing technique for non-verbal expression. However, these studies have limitations, such as simply coding and comparing conceptions from students’ drawings without exploring the relationships between them. Gaining insight into learners’ conceptions can help teachers adjust their strategies to better meet student needs. Students view technology as complex and multifaceted, and ChatGPT has significantly impacted education by encouraging critical engagement with information, facilitating knowledge integration, and offering personalized learning experiences. ChatGPT-supported learning offers students personalized and self-directed learning opportunities. Arif et al. (2023) proposed that ChatGPT could serve as a teaching assistant. One limitation is that it cannot conduct a literature review without relying on biased or inaccurate information. (As of now, it can often produce a literature review without inaccurate information, though errors still occur, and the tool will continue to improve.)
Additionally, there are legal concerns associated with its use in academic settings. Farrokhnia et al. (2023) suggested that ChatGPT might undermine students’ critical thinking skills and could potentially lead to increased plagiarism, which highlights the need for guidelines governing AI in education, since these outcomes are not inevitable when AI is used well. In research on AI learning, surveys have been the most common method, which limits the depth of understanding that can be achieved on the subject. In the current study, 63 university students from northern Taiwan were recruited. These students had enrolled in the information literacy project and completed at least two online courses: “Generative AI” and “The Principles and Practices of ChatGPT,” indicating their familiarity with AI and ChatGPT principles. Participants were asked to spend approximately 50 minutes completing a questionnaire and a drawing task. After finishing the two online courses, the students filled out a questionnaire that included an open-ended question designed to explore their attitudes toward learning with AI. The precise wording of this open-ended question was as follows: “If you are not sure about learning content, would you search for the information through Google or ChatGPT? Why?”
Secondly, participants were asked to draw on A4 paper to illustrate their conceptions of ChatGPT-supported learning. When describing how ChatGPT could facilitate learning, most students’ drawings focused on the learners themselves. The majority of participants depicted “learners” (83.33%), followed by “no human” (15.00%), “teachers” (6.67%), and “robots” (5.00%). The most common words in their captions were knowledge, assignments, information, data, organization, topic, quick, time, answers, and teachers. The results indicated that students’ conceptions predominantly centered on the learners themselves, learning activities (like discussions and consultations) without a designated location, specific learning content, and positive emotions and attitudes. This suggests that most university students view ChatGPT as a tool for flexible learning in an undefined setting and maintain positive emotions and attitudes towards it. Their conceptions largely focused on the learners and positive feelings. This implies that, in AI-assisted learning contexts such as ChatGPT, university students see themselves as the primary learners, while teachers and robots are viewed as secondary. Students perceived ChatGPT as either a tutor or a tool. The results indicated that the students primarily used ChatGPT for the lower levels of Bloom’s Taxonomy, including remembering, understanding, and applying. The study has several limitations. Firstly, the sample size was small and restricted to university students, which may affect the generalizability of the findings to other populations. Secondly, data were collected from students enrolled in an information literacy project at a single university, so the results may not be representative of all university students in Taiwan.
References
Bassett, C. (2022). The Construct Editor: Tweaking with Jane, Writing with Ted, Editing with an AI? Textual Cultures, 15(1), 155–160. https://www.jstor.org/stable/48687521
Clark, D. (2024, August 9). Does Derrida’s View of Language help us understand Generative AI? Donald Clark Plan B. https://donaldclarkplanb.blogspot.com/2024/08/does-derridas-view-of-language-help-us.html
Deans, T., Praver, N., & Solod, A. (2024, February 26). AI in the writing center: Small steps and scenarios. Another Word. https://dept.writing.wisc.edu/blog/ai-wc/
El-Kishky et al. (2024, September 12). Learning to reason with LLMs. https://openai.com/index/learning-to-reason-with-llms
Escalante, J., Pack, A., & Barrett, A. (2023). AI-generated feedback on writing: Insights into efficacy and ENL student preference. International Journal of Educational Technology in Higher Education, 20(1). https://doi.org/10.1186/s41239-023-00425-2
Kahn, J. (2024). AI Isn’t Coming for Your Job—At Least Not Yet. Fortune, 189(2), 17–18.
KU: Center for Teaching Excellence. (2024, July). Ethical use of AI in writing assignments. https://cte.ku.edu/ethical-use-ai-writing-assignments
Lester, D. (2024). Tutors’ Column: GenAI in the Writing Center. WLN: A Journal of Writing Center Scholarship, 48(3). https://doi.org/10.37514/WLN-J.2024.48.3.05
Manyika, J. (2022). Getting AI Right: Introductory Notes on AI & Society. Daedalus, 151(2), 5–27. https://www.jstor.org/stable/48662023
Maphoto, K. B., Sevnarayan, K., Mohale, N. E., Suliman, Z., Ntsopi, T. J., & Mokoena, D. (2024). Advancing students’ academic excellence in distance education: Exploring the potential of Generative AI integration to improve academic writing skills. Open Praxis, 16(2), 142–159. https://doi.org/10.55982/openpraxis.16.2.649
Mitrano, T. (2023, January 16). Coping with ChatGPT. Inside Higher Ed. https://www.insidehighered.com/blogs/law-policy%E2%80%94and-it/coping-chatgpt
Pazzanese, C. (2020, October 26). Ethical concerns mount as AI takes bigger decision-making role. Harvard Gazette. https://news.harvard.edu/gazette/story/2020/10/ethical-concerns--as-ai-takes-bigger-decision-making-role/
Seyyed Kazem Banihashem, Nafiseh Taghizadeh Kerman, Omid Noroozi, Jewoong Moon, & Hendrik Drachsler. (2024). Feedback sources in essay writing: Peer-generated or AI-generated feedback? International Journal of Educational Technology in Higher Education, 21. https://doi-org.corvette.salemstate.edu/10.1186/s41239-024-00455-4
Svanberg, M., Li, W., Fleming, M., Goehring, B., & Thompson, N. (2024). Beyond AI exposure: Which tasks are cost-effective to automate with computer vision? SSRN Electronic Journal. https://doi.org/10.2139/ssrn.4700751
Tu, Y.-F., & Hwang, G.-J. (2023). University students’ conceptions of ChatGPT-supported learning: A drawing and epistemic network analysis. Interactive Learning Environments, 1–25. https://doi.org/10.1080/10494820.2023.2286370